but international search fee (37 CFR 1.445(a)(2)) paid to USPTO $710.00

International preliminary examination fee (37 CFR 1.482) paid to USPTO
0.00



CLAIMS    NUMBER FILED    RATE

Total claims                                   0    x $18.00        0.00
Independent claims                                  x $80.00      160.00
MULTIPLE DEPENDENT CLAIM(S) (if applicable)         + $270.00       0.00

TOTAL OF ABOVE CALCULATIONS                                     1,020.00



□ Applicant claims small entity status. See 37 CFR 1.27. The fees indicated above
are reduced by 1/2.



0.00 



SUBTOTAL = 



1,020.00



Processing fee of $130.00 for furnishing the English translation later than □ 20 □ 30
months from the earliest claimed priority date (37 CFR 1.492(f)).



0.00 



TOTAL NATIONAL FEE = 



1,020.00
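The arithmetic of the fee sheet can be checked with a short sketch. It uses only the amounts legible on the sheet; the basic national fee row is not legible here, so `implied_basic_fee` derives it as the residual, which is an inference and not a figure taken from the form. All names in the block are hypothetical.

```python
from decimal import Decimal

# Amounts as read from the fee sheet (US dollars).
EXTRA_TOTAL_CLAIMS = Decimal("0.00")          # extra total claims x $18.00
EXTRA_INDEPENDENT_CLAIMS = Decimal("160.00")  # extra independent claims x $80.00
MULTIPLE_DEPENDENT = Decimal("0.00")          # + $270.00 if applicable
TOTAL_OF_ABOVE = Decimal("1020.00")           # "TOTAL OF ABOVE CALCULATIONS"
SMALL_ENTITY_REDUCTION = Decimal("0.00")
LATE_TRANSLATION_FEE = Decimal("0.00")        # $130.00 processing fee, not incurred


def implied_basic_fee() -> Decimal:
    """Basic national fee implied by the legible rows (an inference)."""
    claim_fees = EXTRA_TOTAL_CLAIMS + EXTRA_INDEPENDENT_CLAIMS + MULTIPLE_DEPENDENT
    return TOTAL_OF_ABOVE - claim_fees


def total_national_fee() -> Decimal:
    """Subtotal after any small-entity reduction, plus the late-translation fee."""
    subtotal = TOTAL_OF_ABOVE - SMALL_ENTITY_REDUCTION
    return subtotal + LATE_TRANSLATION_FEE


if __name__ == "__main__":
    print("implied basic fee:", implied_basic_fee())
    print("total national fee:", total_national_fee())
```

`Decimal` is used instead of floats so that currency amounts compare exactly.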



Fee for recording the enclosed assignment (37 CFR 1.21(h)). The assignment must be
40.00



TOTAL FEES ENCLOSED =
charged:

TRANSMITTAL LETTER TO THE
UNITED STATES RECEIVING OFFICE



Rec'd PCT/PTO 27 APR 2001



Date: April 27, 2001

International Application No.: PCT/EP99/07952 (stamped serial number: 09/830514)

Attorney Docket No.: Mo-6305/HR-199



I. Certification under 37 CFR 1.10 (if applicable)

ET146893673US
Express Mail mailing number

April 27, 2001
Date of Deposit

I hereby certify that the application/correspondence attached hereto is being deposited with the United States Postal Service "Express Mail Post Office to
Addressee" service under 37 CFR 1.10 on the date indicated above and is addressed to Assistant Commissioner for Patents, Washington, D.C. 20231.

Signature of person mailing correspondence

Donna J. Veatch
Typed or printed name of person mailing correspondence



II. New International Application

TITLE

CONSTRUCTION OF PRODUCTION STRAINS FOR PRODUCING
SUBSTITUTED PHENOLS BY SPECIFICALLY INACTIVATING
GENES OF THE EUGENOL AND FERULIC ACID CATABOLISM

Earliest priority date (Day/Mon/Year): (31/10/98)



SCREENING DISCLOSURE INFORMATION: In order to assist in screening the accompanying international
application for purposes of determining whether a license for foreign transmittal should and could be granted and for
other purposes, the following information is supplied. (Note: check as many boxes as apply):

A. X The invention disclosed was not made in the United States.

B. There is no prior U.S. application relating to this invention.

C. □ The following prior U.S. application(s) contain subject matter which is related to the invention disclosed in the
attached international application. (NOTE: priority to these applications may or may not be claimed on form
PCT/RO/101 (Request) and this listing does not constitute a claim for priority.)



application no. 




filed on 




application no. 




filed on 





D. □ The present international application contains additional subject matter not found in the prior U.S. application(s) identified
in paragraph C. above. The additional subject matter is found on pages ____
and □ DOES NOT ALTER □ MIGHT BE CONSIDERED TO ALTER the general nature of the invention in a
manner which would require the U.S. application to have been made available for inspection by the appropriate
defense agencies under 35 U.S.C. 181 and 37 CFR 5.1. See 37 CFR 5.15



III. □ A Response to an Invitation from the RO/US. The following document(s) is(are) enclosed:

A. □ A Request for An Extension of Time to File a Response

B. □ A Power of Attorney (General or Regular)

C. □ Replacement pages:
     pages ____ of the request (PCT/RO/101)
     pages ____ of the figures
     pages ____ of the description
     pages ____ of the abstract
     pages ____ of the claims

D. □ Submission of Priority Documents
     Priority document
     Priority document

E. □ Fees as specified on attached Fee Calculation sheet form PCT/RO/101 annex



IV. □ A Request for Rectification under PCT 91

    □ A Petition

    □ A Sequence Listing Diskette

V. X Other (please specify): Preliminary Amendment w/ Abstract, Sequence Listing (Paper and Disk Copy)
   Form PTO 1449 w/references
   Drawings (3 sheets)



The person signing this form is the:

□ Applicant
□ Attorney/Agent (Reg. No.) 39,138
□ Common Representative

[illegible] J. Cheung
Typed name of signer

Signature
PATENT APPLICATION

Mo-6305 

HR-199 



IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 



APPLICATION OF 

JORGEN RABENHORST, ET AL. 

SERIAL NUMBER: TO BE ASSIGNED 

FILED: HEREWITH 

TITLE: CONSTRUCTION OF 

PRODUCTION STRAINS FOR 
PRODUCING SUBSTITUTED 
PHENOLS BY SPECIFICALLY 
INACTIVATING GENES OF THE 
EUGENOL AND FERULIC ACID 
CATABOLISM 



PCT/EP99/07952 



PRELIMINARY AMENDMENT 



Assistant Commissioner for Patents 

Washington, D.C. 20231 

Sir: 

Upon the granting of a Serial Number and Filing Date and prior to the 
examination of the subject application, kindly amend the Specification and Claims as 
follows: 



"Express Mail" mailing label number 

Date of Deposit April 27, 2001

I hereby certify that this paper or fee is being deposited with the United States
Postal Service "Express Mail Post Office to Addressee" service under 37 CFR
1.10 on the date indicated above and is addressed to the Assistant Commissioner
of Patents and Trademarks, Washington, D.C. 20231

Donna J. Veatch
(Name of person mailing paper or fee)

(Signature of person mailing paper or fee)




IN THE SPECIFICATION: 



Kindly replace the Title of the Invention with the following:

-- CONSTRUCTION OF PRODUCTION STRAINS FOR PRODUCING
SUBSTITUTED PHENOLS BY SPECIFICALLY INACTIVATING GENES OF THE
EUGENOL AND FERULIC ACID CATABOLISM --.

Kindly insert the following "ABSTRACT" page

-- The present invention relates to a transformed and/or mutagenated
unicellular or multicellular organism which is characterized in that enzymes
of the eugenol and/or ferulic acid catabolism are deactivated in such a
manner that the intermediates coniferyl alcohol, coniferyl aldehyde, ferulic
acid, vanillin and/or vanillinic acid are accumulated. --

On page 1, line 4, kindly insert the following:

-- FIELD OF THE INVENTION --.

On page 1, line 7, kindly insert the following:
--BACKGROUND OF THE INVENTION--.

On page 2, after line 9, kindly insert the following: 

-- BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1a to 1r show gene structures for isolating organisms and
mutants.

FIG. 2a: shows a nucleotide sequence of the calAΩKm gene structure
(SEQ ID NO: 1).

FIG. 2b: shows a nucleotide sequence of the calAΩGm gene structure
(SEQ ID NO: 2).

FIG. 2c: shows a nucleotide sequence of the calAΔ gene structure
(SEQ ID NO: 3).

FIG. 2d: shows a nucleotide sequence of the calBΩKm gene structure
(SEQ ID NO: 4).

FIG. 2e: shows a nucleotide sequence of the calBΩGm gene structure
(SEQ ID NO: 5).

FIG. 2f: shows a nucleotide sequence of the calBΔ gene structure
(SEQ ID NO: 6).

FIG. 2g: shows a nucleotide sequence of the fcsΩKm gene structure
(SEQ ID NO: 7).

FIG. 2h: shows a nucleotide sequence of the fcsΩGm gene structure
(SEQ ID NO: 8).

FIG. 2i: shows a nucleotide sequence of the fcsΔ gene structure
(SEQ ID NO: 9).

FIG. 2j: shows a nucleotide sequence of the echΩKm gene structure
(SEQ ID NO: 10).

FIG. 2k: shows a nucleotide sequence of the echΩGm gene structure
(SEQ ID NO: 11).

FIG. 2l: shows a nucleotide sequence of the echΔ gene structure
(SEQ ID NO: 12).

FIG. 2m: shows a nucleotide sequence of the vdhΩKm gene structure
(SEQ ID NO: 13).

FIG. 2n: shows a nucleotide sequence of the vdhΩGm gene structure
(SEQ ID NO: 14).

FIG. 2o: shows a nucleotide sequence of the vdhΔ gene structure
(SEQ ID NO: 15).

FIG. 2p: shows a nucleotide sequence of the aatΩKm gene structure
(SEQ ID NO: 16).

FIG. 2q: shows a nucleotide sequence of the aatΩGm gene structure
(SEQ ID NO: 17).

FIG. 2r: shows a nucleotide sequence of the aatΔ gene structure
(SEQ ID NO: 18). --.
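The numbering in the figure list follows a regular pattern: six genes, each in three inactivated forms, numbered consecutively. The sketch below reproduces that pattern. It assumes that the suffixes read as ΩKm and ΩGm (omega-element insertions) and Δ (deletion); the function names are hypothetical and serve only as illustration.

```python
# Genes in the order in which the figure list presents them.
GENES = ["calA", "calB", "fcs", "ech", "vdh", "aat"]

# Inactivation variants per gene: two omega-element insertions and one deletion
# (assumed reading of the suffixes in the figure list).
VARIANTS = ["ΩKm", "ΩGm", "Δ"]


def seq_id_table() -> dict[str, int]:
    """Map each gene structure name to its SEQ ID NO (1 to 18)."""
    table = {}
    seq_id = 1
    for gene in GENES:
        for variant in VARIANTS:
            table[f"{gene}{variant}"] = seq_id
            seq_id += 1
    return table


def figure_label(seq_id: int) -> str:
    """Return the FIG. 2 panel for a SEQ ID NO, e.g. 1 -> '2a'."""
    if not 1 <= seq_id <= 18:
        raise ValueError("SEQ ID NO must be between 1 and 18")
    return "2" + chr(ord("a") + seq_id - 1)


if __name__ == "__main__":
    for name, seq_id in seq_id_table().items():
        print(f"FIG. {figure_label(seq_id)}: {name} (SEQ ID NO: {seq_id})")
```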

On page 2, line 10, kindly insert the following: 
--SUMMARY OF THE INVENTION--. 
On page 2, line 19, kindly insert the following:

--DETAILED DESCRIPTION OF THE INVENTION--.

IN THE CLAIMS : 

Kindly cancel Claims 1-16. 
Kindly add the following new claims: 

-- 17. Transformed and/or mutagenized unicellular or multicellular organism
comprising enzymes of eugenol and/or ferulic acid catabolism which are inactivated 
such that the intermediates coniferyl alcohol, coniferyl aldehyde, ferulic acid, vanillin 
and/or vanillic acid accumulate. 

18. An organism according to Claim 17, wherein eugenol and/or ferulic 
acid catabolism is altered by inserting Ω elements, or introducing deletions, into
corresponding genes. 

19. Organism according to Claim 17, wherein one or more genes encoding
the enzymes coniferyl alcohol dehydrogenases, coniferyl aldehyde dehydrogenases, 
feruloyl-CoA synthetases, enoyl-CoA hydratase-aldolases, beta-ketothiolases, 
vanillin dehydrogenases or vanillic acid demethylases is/are altered and/or 
inactivated. 

20. An organism according to Claim 17, wherein said organism is 
unicellular. 

21. An organism according to Claim 20, wherein said organism is selected
from a group consisting of a microorganism, a plant or animal cell. 

22. An organism according to Claim 17, wherein said organism is a
bacterium.

23. An organism according to Claim 22, wherein said organism is of the
Pseudomonas species.

24. Gene structures comprising nucleotide sequences which encode the 
enzymes coniferyl alcohol dehydrogenases, coniferyl aldehyde dehydrogenases, 
feruloyl-CoA synthetases, enoyl-CoA hydratase-aldolases, beta-ketothiolases, 
vanillin-dehydrogenases or vanillic acid demethylases, or two or more of these 
enzymes, and are altered and/or inactivated. 

25. Gene structures having the sequences corresponding to SEQ ID NO:1 
to SEQ ID NO: 18. 

26. Vectors comprising at least one gene structure having the sequences 
corresponding to SEQ ID NO:1 to SEQ ID NO: 18. 

27. A transformed organism according to Claim 17, wherein said organism 
comprises at least one vector comprising at least one gene structure having the 
sequences corresponding to SEQ ID NO:1 to SEQ ID NO: 18. 

28. Organism according to Claim 17, wherein said organism comprises at 
least one gene structure having the sequences corresponding to SEQ ID NO:1 to 
SEQ ID NO: 18 which is integrated into the genome instead of the respective intact 
gene. 

29. Process for the biotechnological preparation of alcohols, aldehydes 
and organic acids, comprising the step of adding an organism comprising enzymes 
of eugenol and/or ferulic acid catabolism which are inactivated such that the 
intermediates coniferyl alcohol, coniferyl aldehyde, ferulic acid, vanillin and/or vanillic 
acid accumulate. 

30. Process for preparing an organism according to Claim 17, wherein the
alteration eugenol and/or ferulic acid catabolism is achieved by microbiological
culturing methods.

31. Process for preparing an organism according to Claim 29, wherein the
alteration in eugenol and/or ferulic acid catabolism, and/or the inactivation of the
corresponding genes, is achieved by means of recombinant DNA methods. --.

REMARKS

The Applicants respectfully request the Preliminary Amendment be entered 
as the amendment places the claims as well as the Specification in proper form. 

New Claims 17-31 replace now cancelled Claims 1-16. Support for the
new claims is found in the respective original cancelled claims. The Applicants
respectfully submit that no new matter is added.

Additionally, the Applicants hereby submit a paper copy of the "Sequence 
Listing" as well as a copy of the "Sequence Listing" in computer readable form. The 
"Sequence Listing" has been amended to place it in proper form for U.S. filing. The 
Applicants also state that the information recorded in computer readable form is 
identical to the written sequence listing. 

The attached page is captioned "VERSION WITH MARKINGS TO SHOW
CHANGES MADE".



Respectfully submitted, 




Attorney for Applicants 
Reg. No. 39,138 

Bayer Corporation 
100 Bayer Road

Pittsburgh, Pennsylvania 15205-9741 
(412) 777-8338 

FACSIMILE PHONE NUMBER: 
(412) 777-8363 

VERSION WITH MARKINGS TO SHOW CHANGES MADE

IN THE SPECIFICATION: 

Kindly replace the Title of the Invention with the following: 

-- CONSTRUCTION OF PRODUCTION STRAINS FOR PRODUCING 
SUBSTITUTED PHENOLS BY SPECIFICALLY INACTIVATING GENES OF THE 
EUGENOL AND FERULIC ACID CATABOLISM --. 

Kindly insert the following "ABSTRACT" page 

-- The present invention relates to a transformed and/or mutagenated
unicellular or multicellular organism which is characterized in that enzymes 
of the eugenol and/or ferulic acid catabolism are deactivated in such a 
manner that the intermediates coniferyl alcohol, coniferyl aldehyde, ferulic 
acid, vanillin and/or vanillinic acid are accumulated. -- 

On page 1, line 4, kindly insert the following:
-- FIELD OF THE INVENTION --.

On page 1, line 7, kindly insert the following:
--BACKGROUND OF THE INVENTION--.

On page 2, after line 9, kindly insert the following: 
-- BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1a to 1r show gene structures for isolating organisms and
mutants.

FIG. 2a: shows a nucleotide sequence of the calAΩKm gene structure
(SEQ ID NO: 1).

FIG. 2b: shows a nucleotide sequence of the calAΩGm gene structure
(SEQ ID NO: 2).

FIG. 2c: shows a nucleotide sequence of the calAΔ gene structure
(SEQ ID NO: 3).

FIG. 2d: shows a nucleotide sequence of the calBΩKm gene structure
(SEQ ID NO: 4).

FIG. 2e: shows a nucleotide sequence of the calBΩGm gene structure
(SEQ ID NO: 5).

FIG. 2f: shows a nucleotide sequence of the calBΔ gene structure
(SEQ ID NO: 6).

FIG. 2g: shows a nucleotide sequence of the fcsΩKm gene structure
(SEQ ID NO: 7).

FIG. 2h: shows a nucleotide sequence of the fcsΩGm gene structure
(SEQ ID NO: 8).

FIG. 2i: shows a nucleotide sequence of the fcsΔ gene structure
(SEQ ID NO: 9).

FIG. 2j: shows a nucleotide sequence of the echΩKm gene structure
(SEQ ID NO: 10).

FIG. 2k: shows a nucleotide sequence of the echΩGm gene structure
(SEQ ID NO: 11).

FIG. 2l: shows a nucleotide sequence of the echΔ gene structure
(SEQ ID NO: 12).

FIG. 2m: shows a nucleotide sequence of the vdhΩKm gene structure
(SEQ ID NO: 13).

FIG. 2n: shows a nucleotide sequence of the vdhΩGm gene structure
(SEQ ID NO: 14).

FIG. 2o: shows a nucleotide sequence of the vdhΔ gene structure
(SEQ ID NO: 15).

FIG. 2p: shows a nucleotide sequence of the aatΩKm gene structure
(SEQ ID NO: 16).

FIG. 2q: shows a nucleotide sequence of the aatΩGm gene structure
(SEQ ID NO: 17).

FIG. 2r: shows a nucleotide sequence of the aatΔ gene structure
(SEQ ID NO: 18). --.

On page 2, line 10, kindly insert the following:
--SUMMARY OF THE INVENTION--.

On page 2, line 19, kindly insert the following:

--DETAILED DESCRIPTION OF THE INVENTION--.

IN THE CLAIMS : 

Kindly cancel Claims 1-16. 
Kindly add the following new claims: 

-- 17. Transformed and/or mutagenized unicellular or multicellular organism
comprising enzymes of eugenol and/or ferulic acid catabolism which are inactivated 
such that the intermediates coniferyl alcohol, coniferyl aldehyde, ferulic acid, vanillin 
and/or vanillic acid accumulate. 

18. An organism according to Claim 17, wherein eugenol and/or ferulic 
acid catabolism is altered by inserting Ω elements, or introducing deletions, into
corresponding genes. 

19. Organism according to Claim 17, wherein one or more genes encoding 
the enzymes coniferyl alcohol dehydrogenases, coniferyl aldehyde dehydrogenases, 
feruloyl-CoA synthetases, enoyl-CoA hydratase-aldolases, beta-ketothiolases, 
vanillin dehydrogenases or vanillic acid demethylases is/are altered and/or 
inactivated. 

20. An organism according to Claim 17, wherein said organism is 
unicellular. 

21. An organism according to Claim 20, wherein said organism is selected
from a group consisting of a microorganism, a plant or animal cell. 

22. An organism according to Claim 17, wherein said organism is a
bacterium. 

23. An organism according to Claim 22, wherein said organism is of the 
Pseudomonas species. 

24. Gene structures comprising nucleotide sequences which encode the 
enzymes coniferyl alcohol dehydrogenases, coniferyl aldehyde dehydrogenases, 
feruloyl-CoA synthetases, enoyl-CoA hydratase-aldolases, beta-ketothiolases, 
vanillin-dehydrogenases or vanillic acid demethylases, or two or more of these 
enzymes, and are altered and/or inactivated. 

25. Gene structures having the sequences corresponding to SEQ ID NO:1 
to SEQ ID NO: 18.

26. Vectors comprising at least one gene structure having the sequences 
corresponding to SEQ ID NO:1 to SEQ ID NO: 18. 

27. A transformed organism according to Claim 17, wherein said organism 
comprises at least one vector comprising at least one gene structure having the 
sequences corresponding to SEQ ID NO:1 to SEQ ID NO: 18. 

28. Organism according to Claim 17, wherein said organism comprises at 
least one gene structure having the sequences corresponding to SEQ ID NO:1 to 
SEQ ID NO: 18 which is integrated into the genome instead of the respective intact 
gene. 

29. Process for the biotechnological preparation of alcohols, aldehydes 
and organic acids, comprising the step of adding an organism comprising enzymes 
of eugenol and/or ferulic acid catabolism which are inactivated such that the 
intermediates coniferyl alcohol, coniferyl aldehyde, ferulic acid, vanillin and/or vanillic
acid accumulate. 

30. Process for preparing an organism according to Claim 17, wherein the 
alteration eugenol and/or ferulic acid catabolism is achieved by microbiological 
culturing methods. 

31. Process for preparing an organism according to Claim 29, wherein the
alteration in eugenol and/or ferulic acid catabolism, and/or the inactivation of the
corresponding genes, is achieved by means of recombinant DNA methods. --.

09/830514

WO 00/26355    PCT/EP99/07952    Rec'd PCT/PTO 27 APR 2001

-37- 

CONSTRUCTION OF PRODUCTION STRAINS 
FOR PRODUCING SUBSTITUTED PHENOLS 
BY SPECIFICALLY INACTIVATING GENES OF 
THE EUGENOL AND FERULIC ACID CATABOLISM 

ABSTRACT OF THE DISCLOSURE 

The present invention relates to a transformed and/or mutagenated 
unicellular or multicellular organism which is characterized in that enzymes 
of the eugenol and/or ferulic acid catabolism are deactivated in such a 
manner that the intermediates coniferyl alcohol, coniferyl aldehyde, ferulic 
acid, vanillin and/or vanillinic acid are accumulated. 

WO 00/26355    PCT/EP99/07952

JC18 Rec'd PCT/PTO 27 APR 2001

Constructing production strains for producing substituted phenols by
specifically inactivating genes of eugenol and ferulic acid catabolism



The present invention relates to the construction of production strains and to a
process for preparing substituted methoxyphenols, in particular vanillin.

DE-A 4 227 076 (process for preparing substituted methoxyphenols, and
microorganism which is suitable for this purpose) describes the preparation of
substituted methoxyphenols using a novel Pseudomonas sp. The starting material in
this context is eugenol and the products are ferulic acid, vanillic acid, coniferyl
alcohol and coniferyl aldehyde.

An extensive review of the biotransformations possible using ferulic acid,
written by Rosazza et al. (Biocatalytic transformation of ferulic acid:
an abundant aromatic natural product; J. Ind. Microbiol. 15:457-471), also appeared
in 1995.

The genes and enzymes for synthesizing coniferyl alcohol, coniferyl aldehyde, ferulic
acid, vanillin and vanillic acid from Pseudomonas sp. were described in EP-A
0 845 532.

The enzymes for converting trans-ferulic acid into trans-feruloyl-SCoA ester and
subsequently into vanillin, and also the gene for cleaving the ester, were described by
the Institute of Food Research, Norwich, GB, in WO 97/35999. In 1998, the content
of the patent also appeared in the form of scientific publications (Gasson et al. 1998.
Metabolism of ferulic acid to vanillin. J. Biol. Chem. 273:4163-4170; Narbad and
Gasson 1998. Metabolism of ferulic acid via vanillin using a novel CoA-dependent
pathway in a newly isolated strain of Pseudomonas fluorescens. Microbiology
144:1397-1405).

DE-A 195 32 317 describes the use of Amycolatopsis sp. for obtaining vanillin from
ferulic acid fermentatively in high yields.

The known processes suffer from the disadvantage that they either achieve only very
low yields of vanillin or make use of relatively expensive starting compounds. While
the last-mentioned process (DE-A 195 32 317) does achieve high yields, the use of
Pseudomonas sp. HR199 and Amycolatopsis sp. HR167 for biotransforming eugenol
into vanillin requires a fermentation which is carried out in two steps, consequently
leading to substantial expense and consumption of time.

The object of the present invention is therefore to construct organisms which are able 
to convert the relatively inexpensive raw material eugenol into vanillin in a one-step 
process. 

This object is achieved by means of constructing production strains of unicellular or
multicellular organisms, which strains are characterized in that enzymes of eugenol
and/or ferulic acid catabolism are inactivated such that the intermediates coniferyl
alcohol, coniferyl aldehyde, ferulic acid, vanillin and/or vanillic acid accumulate.

The production strain may be unicellular or multicellular. Accordingly, the invention
can relate to microorganisms, plants or animals. Furthermore, use can also be made
of extracts which are obtained from the production strain. According to the invention,
preference is given to using unicellular organisms. These latter organisms can be
microorganisms or animal or plant cells. According to the invention, particular
preference is given to using fungi and bacteria. The highest preference is given to
bacterial species. Those bacteria which may in particular be used, after their eugenol
and/or ferulic acid catabolism has/have been altered, are species of Rhodococcus,
Pseudomonas and Escherichia.

In the simplest case, known, conventional microbiological methods can be used for
isolating the organisms which may be employed in accordance with the invention.
Thus, the enzymic activity of the proteins involved in eugenol and/or ferulic acid
catabolism can be altered by using enzyme inhibitors. Furthermore, the enzymic
activity of the proteins involved in eugenol and/or ferulic acid catabolism can be
altered by mutating the genes which encode these proteins. Such mutations can be
generated in a random manner by means of classical methods, for example by using
UV irradiation or mutation-inducing chemicals.

Recombinant DNA methods, such as deletions, insertions and/or nucleotide
exchanges, are likewise suitable for isolating the novel organisms. Thus, the genes of
the organisms can, for example, be inactivated using other DNA elements (Ω
elements). Suitable vectors can likewise be used for replacing the intact genes with
gene structures which are altered and/or inactivated. In this context, the genes which
are to be inactivated, and the DNA elements which are employed for the inactivation,
can be obtained by means of classical cloning techniques or by means of polymerase
chain reactions (PCR).

For example, in one possible embodiment of the invention, eugenol catabolism and
ferulic acid catabolism can be altered by inserting Ω elements, or introducing
deletions, into appropriate genes. In this context, the abovementioned recombinant
DNA methods can be used to inactivate the functions of the genes which encode
dehydrogenases, synthetases, hydratase-aldolases, thiolases or demethylases, such that
production of the relevant enzymes is blocked. Preferably, the genes are those which
encode coniferyl alcohol dehydrogenases, coniferyl aldehyde dehydrogenases,
feruloyl-CoA synthetases, enoyl-CoA hydratase-aldolases, beta-ketothiolases,
vanillin dehydrogenases or vanillic acid demethylases. Very particular preference is
given to genes which encode the amino acid sequences specified in EP-A 0845532
and/or nucleotide sequences which encode their allelic variations.

The invention accordingly also relates to gene structures for preparing transformed
organisms and mutants.

Preference is given to employing gene structures in which the nucleotide sequences
encoding dehydrogenases, synthetases, hydratase-aldolases, thiolases or demethylases
are inactivated for isolating the organisms and mutants. Particular preference is given
to gene structures in which the nucleotide sequences encoding coniferyl alcohol
dehydrogenases, coniferyl aldehyde dehydrogenases, feruloyl-CoA synthetases,
enoyl-CoA hydratase-aldolases, beta-ketothiolases, vanillin dehydrogenases or
vanillic acid demethylases are inactivated. Very particular preference is given to gene
structures which exhibit the structures given in Figures 1a to 1r having the nucleotide
sequences which are depicted in Figures 2a to 2r and/or nucleotide sequences
encoding their allelic variations. In this context, particular preference is given to
nucleotide sequences 1 to 18.

The invention also encompasses the part sequences of these gene structures as well as
functional equivalents. Functional equivalents are to be understood as meaning those
derivatives of the DNA in which individual nucleobases have been exchanged
(wobble exchanges) without the function being altered. Amino acids may also be
exchanged at the protein level without this resulting in an alteration in function.
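A wobble exchange of the kind mentioned above can be illustrated with a small sketch. It uses a deliberately partial excerpt of the standard genetic code and compares two DNA sequences by their translation. The helper names and the example sequences are hypothetical and are not taken from the sequence listing.

```python
# Partial excerpt of the standard genetic code (DNA codons -> one-letter amino acids).
CODON_TABLE = {
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",  # alanine
    "AAA": "K", "AAG": "K",                          # lysine
    "GAT": "D", "GAC": "D",                          # aspartate
    "ATG": "M",                                      # methionine
    "TGG": "W",                                      # tryptophan
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence codon by codon."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def same_protein(dna_a: str, dna_b: str) -> bool:
    """True if both sequences encode the same amino acid sequence."""
    return len(dna_a) == len(dna_b) and translate(dna_a) == translate(dna_b)


if __name__ == "__main__":
    original = "ATGGCTAAA"   # Met-Ala-Lys
    wobble = "ATGGCCAAG"     # third-position exchanges only, still Met-Ala-Lys
    changed = "ATGGATAAA"    # second-position exchange, Met-Asp-Lys
    print(same_protein(original, wobble))   # True
    print(same_protein(original, changed))  # False
```

Identical translation is only a necessary condition used here for illustration; whether an exchange leaves the function unaltered in the sense of the text has to be established experimentally.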

One or more DNA sequences can be inserted upstream and/or downstream of the
gene structures. By cloning the gene structures, it is possible to obtain plasmids or
vectors which are suitable for the transformation and/or transfection of an organism
and/or for conjugative transfer into an organism.

The invention furthermore relates to plasmids and/or vectors for preparing the
organisms and mutants which are transformed in accordance with the invention.
These organisms and mutants consequently harbour the gene structures which have
been described. The present invention accordingly also relates to organisms which
harbour the said plasmids and/or vectors.

The nature of the plasmids and/or vectors depends on what they are being used for. In
order, for example, to be able to replace the intact genes of eugenol and/or ferulic
acid catabolism in pseudomonads with the genes which have been inactivated with
omega elements, there is a need for vectors which, on the one hand, can be
transferred into pseudomonads (conjugatively transferable plasmids) but which, on
the other hand, cannot be replicated in these organisms and are consequently unstable
in pseudomonads (so-called suicide plasmids). DNA segments which are transferred
into pseudomonads with the aid of such a plasmid system can only be retained if they
become integrated by homologous recombination into the genome of the bacterial
cell.

The described gene structures, vectors and plasmids may be used for preparing
different transformed organisms or mutants. The said gene structures can be used for
replacing intact nucleic acid sequences with altered and/or inactivated gene
structures. In the cells, which can be obtained by transformation or transfection or
conjugation, the intact gene is replaced, by homologous recombination, with the
altered and/or inactivated gene structure, as a consequence of which the resulting
cells now only possess the altered and/or inactivated gene structure in their genome.
In this way, genes can preferably be altered and/or inactivated, in accordance with the
invention, such that the relevant organisms are able to produce coniferyl alcohol,
coniferyl aldehyde, ferulic acid, vanillin and/or vanillic acid.
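The gene replacement described above can be pictured with a toy string model. This is only a sketch under strong simplifying assumptions: sequences are plain strings, homology means an exact match of both flanking regions, and a construct without matching flanks is simply lost, as would be the case for a non-replicating suicide plasmid. The function name and the sequences are hypothetical.

```python
def replace_by_homologous_recombination(
    genome: str, left_flank: str, right_flank: str, inactivated_gene: str
) -> str:
    """Swap the region between two flanks for the inactivated gene structure.

    Returns the genome unchanged when either flank is missing, which models
    the loss of a construct that cannot integrate.
    """
    start = genome.find(left_flank)
    if start == -1:
        return genome
    region_start = start + len(left_flank)
    region_end = genome.find(right_flank, region_start)
    if region_end == -1:
        return genome
    return genome[:region_start] + inactivated_gene + genome[region_end:]


if __name__ == "__main__":
    left, intact, right = "ACGTAC", "GGGCCC", "TTGACA"
    genome = "TT" + left + intact + right + "GG"
    # Intact gene interrupted by an inserted element, written in lower case.
    inactivated = "GGG" + "omega" + "CCC"
    print(replace_by_homologous_recombination(genome, left, right, inactivated))
```

After the exchange the model genome contains only the inactivated gene structure between the flanks, which mirrors the statement that the resulting cells possess only the altered gene structure.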

Mutants of the strain Pseudomonas sp. HR199 (DSM 7063), which was described in
detail in DE-A 4 227 076 and EP-A 0845532, are examples of production strains
which have been constructed in this way in accordance with the invention, with the
corresponding gene structures ensuing, inter alia, from Figures 1a to 1r, in
combination with Figures 2a to 2r:

1. Pseudomonas sp. HR199ca/AQKm, which contains the £2Km-inactivated 
calA gene in place of the intact calA gene encoding coniferyl alcohol 
dehydrogenase (Fig. la; Fig. 2a). 
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2. Pseudomonas sp. HR199ca/A£2Gm, which contains the QGm-inactivated 
calA gene in place of the intact calA gene encoding coniferyl alcohol 
dehydrogenase (Fig. lb; Fig. 2b). 

3. Pseudomonas sp. HR199ca/AA, which contains the deletion-inactivated calA 
5 gene in place of the intact calA gene encoding coniferyl alcohol 

dehydrogenase (Fig. lc; Fig. 2c). 

4. Pseudomonas sp. HR199ca/£QKm, which contains the QKm-inactivated 
calB gene in place of the intact calB gene encoding coniferyl aldehyde 
dehydrogenase (Fig. Id; Fig. 2d) 

10 5. Pseudomonas sp. HR199ca/B£2Gm, which contains the £2Gm-inactivated 
calB gene in place of the intact calB gene encoding coniferyl aldehyde 
dehydrogenase (Fig. le; Fig. 2e). 

6. Pseudomonas sp. HR199ca/5A, which contains the deletion-inactivated calB 
gene in place of the intact calB gene encoding coniferyl aldehyde 

15 dehydrogenase (Fig. If; Fig. 2f). 

7. Pseudomonas sp. HR199/cs£2Km, which contains the £2Km-inactivated fcs 
gene in place of the intact fcs gene encoding feruloyl-CoA synthetase (Fig.lg; 
Fig. 2g). 

8. Pseudomonas sp. FLR199/c\sQGm, which contains the QGm-inactivated fcs 
20 gene in place of the intact fcs gene encoding feruloyl-CoA synthetase (Fig.lh; 

Fig. 2h). 

9. Pseudomonas sp. HR199/csA, which contains the deletion-inactivated fcs 
gene in place of the intact fcs gene encoding feruloyl-CoA synthetase (Fig.li; 
Fig. 2i). 

25 10. Pseudomonas sp. HR199ecft£2Km, which contains the £2Km-inactivated ech 
gene in place of the intact ech gene encoding enoyl-CoA hydratase-aldolase 
(FigJj;Fig.2j). 

11. Pseudomonas sp. HRl99echQ,Grn y which contains the £2Gm-inactivated ech 
gene in place of the intact ech gene encoding enoyl-CoA hydratase-aldolase 
30 (Fig.lk;Fig.2k). 



-7- 



12. Pseudomonas sp. HR199echΔ, which contains the deletion-inactivated ech
gene in place of the intact ech gene encoding enoyl-CoA hydratase-aldolase
(Fig. 1l; Fig. 2l).

13. Pseudomonas sp. HR199aatΩKm, which contains the ΩKm-inactivated aat
gene in place of the intact aat gene encoding beta-ketothiolase (Fig. 1m;
Fig. 2m).

14. Pseudomonas sp. HR199aatΩGm, which contains the ΩGm-inactivated aat
gene in place of the intact aat gene encoding beta-ketothiolase (Fig. 1n;
Fig. 2n).

15. Pseudomonas sp. HR199aatΔ, which contains the deletion-inactivated aat
gene in place of the intact aat gene encoding beta-ketothiolase (Fig. 1o;
Fig. 2o).

16. Pseudomonas sp. HR199vdhΩKm, which contains the ΩKm-inactivated vdh
gene in place of the intact vdh gene encoding vanillin dehydrogenase (Fig. 1p;
Fig. 2p).

17. Pseudomonas sp. HR199vdhΩGm, which contains the ΩGm-inactivated vdh
gene in place of the intact vdh gene encoding vanillin dehydrogenase (Fig. 1q;
Fig. 2q).

18. Pseudomonas sp. HR199vdhΔ, which contains the deletion-inactivated vdh
gene in place of the intact vdh gene encoding vanillin dehydrogenase (Fig. 1r;
Fig. 2r).

19. Pseudomonas sp. HR199vdhBΩKm, which contains the ΩKm-inactivated
vdhB gene in place of the intact vdhB gene encoding vanillin dehydrogenase
II.

20. Pseudomonas sp. HR199vdhBΩGm, which contains the ΩGm-inactivated
vdhB gene in place of the intact vdhB gene encoding vanillin dehydrogenase
II.

21. Pseudomonas sp. HR199vdhBΔ, which contains the deletion-inactivated vdhB
gene in place of the intact vdhB gene encoding vanillin dehydrogenase II.

22. Pseudomonas sp. HR199adhΩKm, which contains the ΩKm-inactivated adh
gene in place of the intact adh gene encoding alcohol dehydrogenase.

23. Pseudomonas sp. HR199adhΩGm, which contains the ΩGm-inactivated adh
gene in place of the intact adh gene encoding alcohol dehydrogenase.

24. Pseudomonas sp. HR199adhΔ, which contains the deletion-inactivated adh
gene in place of the intact adh gene encoding alcohol dehydrogenase.

25. Pseudomonas sp. HR199vanAΩKm, which contains the ΩKm-inactivated
vanA gene in place of the intact vanA gene encoding the α-subunit of vanillic
acid demethylase.

26. Pseudomonas sp. HR199vanAΩGm, which contains the ΩGm-inactivated
vanA gene in place of the intact vanA gene encoding the α-subunit of vanillic
acid demethylase.

27. Pseudomonas sp. HR199vanAΔ, which contains the deletion-inactivated vanA
gene in place of the intact vanA gene encoding the α-subunit of vanillic acid
demethylase.

28. Pseudomonas sp. HR199vanBΩKm, which contains the ΩKm-inactivated
vanB gene in place of the intact vanB gene encoding the β-subunit of vanillic
acid demethylase.

29. Pseudomonas sp. HR199vanBΩGm, which contains the ΩGm-inactivated
vanB gene in place of the intact vanB gene encoding the β-subunit of vanillic
acid demethylase.

30. Pseudomonas sp. HR199vanBΔ, which contains the deletion-inactivated vanB
gene in place of the intact vanB gene encoding the β-subunit of vanillic acid
demethylase.
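
The thirty single mutants listed above follow a regular scheme: each of ten genes is combined with each of three kinds of inactivation (ΩKm insertion, ΩGm insertion, deletion). The following Python sketch regenerates the strain designations from that scheme. The dictionary and function names (`GENES`, `INACTIVATIONS`, `single_mutants`) are illustrative assumptions and not part of the specification; the gene order here is alphabetical by pathway position and need not match the numbering of the list.

```python
from itertools import product

# Gene symbols and the enzymes they encode, as named in the list above.
GENES = {
    "calA": "coniferyl alcohol dehydrogenase",
    "calB": "coniferyl aldehyde dehydrogenase",
    "fcs": "feruloyl-CoA synthetase",
    "ech": "enoyl-CoA hydratase-aldolase",
    "aat": "beta-ketothiolase",
    "vdh": "vanillin dehydrogenase",
    "vdhB": "vanillin dehydrogenase II",
    "adh": "alcohol dehydrogenase",
    "vanA": "alpha-subunit of vanillic acid demethylase",
    "vanB": "beta-subunit of vanillic acid demethylase",
}

# Suffix appended to the gene symbol, and the wording used for each
# kind of inactivation.
INACTIVATIONS = {
    "ΩKm": "ΩKm-inactivated",
    "ΩGm": "ΩGm-inactivated",
    "Δ": "deletion-inactivated",
}


def single_mutants(parent="HR199"):
    """Return (strain designation, description) for every gene/inactivation pair."""
    return [
        (f"Pseudomonas sp. {parent}{gene}{suffix}",
         f"{kind} {gene} gene (encoding {enzyme})")
        for (gene, enzyme), (suffix, kind)
        in product(GENES.items(), INACTIVATIONS.items())
    ]


if __name__ == "__main__":
    for name, description in single_mutants():
        print(f"{name}: {description}")
```

Ten genes times three inactivation types gives the thirty designations of the list.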



The invention additionally relates to a process for the biotechnological preparation of
organic compounds. In particular, this process can be used to prepare alcohols,
aldehydes and organic acids. The preferred products are coniferyl alcohol, coniferyl
aldehyde, ferulic acid, vanillin and vanillic acid.

The above-described organisms are employed in the novel process. The organisms
which are very particularly preferred include bacteria, in particular the Pseudomonas
species. Specifically, the abovementioned Pseudomonas species can preferably be
employed for the following processes:

1. Pseudomonas sp. HR199calAΩKm, Pseudomonas sp. HR199calAΩGm and
Pseudomonas sp. HR199calAΔ for preparing coniferyl alcohol from eugenol.

2. Pseudomonas sp. HR199calBΩKm, Pseudomonas sp. HR199calBΩGm and
Pseudomonas sp. HR199calBΔ for preparing coniferyl aldehyde from eugenol
or coniferyl alcohol.

3. Pseudomonas sp. HR199fcsΩKm, Pseudomonas sp. HR199fcsΩGm,
Pseudomonas sp. HR199fcsΔ, Pseudomonas sp. HR199echΩKm, Pseudomonas
sp. HR199echΩGm and Pseudomonas sp. HR199echΔ for preparing ferulic
acid from eugenol or coniferyl alcohol or coniferyl aldehyde.

4. Pseudomonas sp. HR199vdhΩKm, Pseudomonas sp. HR199vdhΩGm,
Pseudomonas sp. HR199vdhΔ, Pseudomonas sp. HR199vdhΩGmvdhBΩKm,
Pseudomonas sp. HR199vdhΩKmvdhBΩGm, Pseudomonas sp. HR199vdhΔ
vdhBΩGm and Pseudomonas sp. HR199vdhΔvdhBΩKm for preparing
vanillin from eugenol or coniferyl alcohol or coniferyl aldehyde or ferulic
acid.

5. Pseudomonas sp. HR199vanAΩKm, Pseudomonas sp. HR199vanAΩGm,
Pseudomonas sp. HR199vanAΔ, Pseudomonas sp. HR199vanBΩKm,
Pseudomonas sp. HR199vanBΩGm and Pseudomonas sp. HR199vanBΔ for
preparing vanillic acid from eugenol or coniferyl alcohol or coniferyl
aldehyde or ferulic acid or vanillin.

Eugenol is the preferred substrate. However, it is also possible to add further 
substrates or even to replace the eugenol with another substrate. 
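
The assignment of mutants to processes in the list above can be read as a simple rule: the product which is prepared is the intermediate immediately upstream of the inactivated step, and every earlier intermediate of eugenol catabolism can serve as substrate. The Python sketch below encodes that reading. The linear ordering in `PATHWAY` and the names `PRODUCT_BY_GENE` and `substrates_for` are illustrative assumptions derived from the list, not definitions taken from the specification.

```python
# Intermediates of eugenol catabolism in the order implied by the
# process list above (assumed to be a linear sequence).
PATHWAY = [
    "eugenol",
    "coniferyl alcohol",
    "coniferyl aldehyde",
    "ferulic acid",
    "vanillin",
    "vanillic acid",
]

# Product prepared with mutants in which the given gene is inactivated,
# taken from items 1 to 5 of the process list.
PRODUCT_BY_GENE = {
    "calA": "coniferyl alcohol",
    "calB": "coniferyl aldehyde",
    "fcs": "ferulic acid",
    "ech": "ferulic acid",
    "vdh": "vanillin",
    "vanA": "vanillic acid",
    "vanB": "vanillic acid",
}


def substrates_for(gene):
    """Return the substrates from which the product of a mutant can be prepared."""
    target = PRODUCT_BY_GENE[gene]
    return PATHWAY[:PATHWAY.index(target)]


if __name__ == "__main__":
    for gene, target in PRODUCT_BY_GENE.items():
        print(f"{gene} mutants: {target} from {' or '.join(substrates_for(gene))}")
```

For example, `substrates_for("fcs")` yields eugenol, coniferyl alcohol and coniferyl aldehyde, which matches item 3 of the list.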

Suitable nutrient media for the organisms which are employed in accordance with the
invention are synthetic, semisynthetic or complex culture media. These media may
comprise carbon-containing and nitrogen-containing compounds, inorganic salts,
where appropriate trace elements, and vitamins.

Carbon-containing compounds which may be suitable are carbohydrates,
hydrocarbons or organic standard chemicals. Examples of compounds which may
preferably be used are sugars, alcohols or sugar alcohols, organic acids or complex
mixtures.

The sugar is preferably glucose. The organic acids which may preferably be
employed are citric or acetic acid. Examples of the complex mixtures are malt
extract, yeast extract, casein or casein hydrolysate.

Suitable nitrogen-containing substrates include inorganic compounds, for example
nitrates and ammonium salts. Organic nitrogen sources can also be used. These
sources include yeast extract, soya bean meal, casein, casein hydrolysate and corn
steep liquor.

Examples of the inorganic salts which may be employed are sulphates, nitrates,
chlorides, carbonates and phosphates. The metals which the said salts contain are
preferably sodium, potassium, magnesium, manganese, calcium, zinc and iron.

The temperature for the culture is preferably in the range from 5 to 100°C. The range
from 15 to 60°C is particularly preferred, with 22 to 37°C being most preferred.

The pH of the medium is preferably 2 to 12. The range from 4 to 8 is particularly
preferred.

In principle, any bioreactor known to the skilled person can be employed for carrying
out the novel process. Preferential consideration is given to any appliance which is
suitable for submerged processes. This means that vessels with or without
a mechanical mixing device may be employed in accordance with the invention.
Examples of vessels without a mechanical mixing device are shaking apparatuses,
bubble column reactors and loop reactors. Vessels with a mechanical mixing device
preferably include all the known appliances which are fitted with stirrers of any design.

The novel process can be carried out continuously or batchwise. The fermentation
time required for achieving a maximum quantity of product depends on the specific
nature of the organism employed. However, in principle, the fermentation times are
between 2 and 200 hours.
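
The nested ranges given above for temperature, pH and fermentation time can be written down as a small lookup, which may be convenient when checking a planned culture against the stated preferences. This is a minimal Python sketch; the names `RANGES` and `classify` are hypothetical, and treating the range limits as inclusive is an assumption, since the text does not say whether the end points are included.

```python
# Ranges from the text, ordered from narrowest to widest so that the
# first match is the strongest stated preference.
RANGES = {
    "temperature_c": [
        ("most preferred", 22, 37),
        ("particularly preferred", 15, 60),
        ("preferred", 5, 100),
    ],
    "ph": [
        ("particularly preferred", 4, 8),
        ("preferred", 2, 12),
    ],
    "fermentation_h": [
        ("stated range", 2, 200),
    ],
}


def classify(parameter, value):
    """Return the narrowest stated range that contains the value."""
    for label, low, high in RANGES[parameter]:
        if low <= value <= high:  # inclusive limits are an assumption
            return label
    return "outside stated range"


if __name__ == "__main__":
    print(classify("temperature_c", 30))
    print(classify("ph", 10))
    print(classify("fermentation_h", 300))
```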

The invention is explained in more detail below with reference to examples.

Mutants of the eugenol-utilizing strain Pseudomonas sp. HR199 (DSM 7063) were
generated in a targeted manner by specifically inactivating genes of eugenol
catabolism, either by inserting omega elements or by introducing deletions. The
omega elements employed were DNA segments which encoded resistance to the
antibiotics kanamycin (ΩKm) and gentamycin (ΩGm). These resistance genes were
isolated from Tn5 and the plasmid pBBR1MCS-5 using standard methods. The genes
calA, calB, fcs, ech, aat, vdh, adh, vdhB, vanA and vanB, which encode coniferyl
alcohol dehydrogenase, coniferyl aldehyde dehydrogenase, feruloyl-CoA synthetase,
enoyl-CoA hydratase-aldolase, beta-ketothiolase, vanillin dehydrogenase, alcohol
dehydrogenase, vanillin dehydrogenase II and vanillic acid demethylase, were isolated
from genomic DNA of the strain Pseudomonas sp. HR199 using standard methods
and cloned into pBluescript SK⁻. By means of digesting with suitable restriction
endonucleases, DNA segments were removed from these genes (deletion) or
substituted with Ω elements (insertion), resulting in the respective gene being
inactivated. The genes which had been mutated in this manner were recloned into
conjugatively transferable vectors and subsequently introduced into the strain
Pseudomonas sp. HR199. Suitable selection was used to obtain transconjugants
which had replaced the respective functional wild-type gene with the newly
introduced inactivated gene. The insertion and deletion mutants which were obtained
in this way possessed only the respective inactivated gene. This procedure was
used to obtain both mutants possessing only one defective gene and multiple mutants,
in which several genes had been inactivated in this manner. These mutants were
employed for biotransforming

a) eugenol into coniferyl alcohol, coniferyl aldehyde, ferulic acid, vanillin and/or 
vanillic acid; 

b) coniferyl alcohol into coniferyl aldehyde, ferulic acid, vanillin and/or vanillic acid; 

c) coniferyl aldehyde into ferulic acid, vanillin and/or vanillic acid; 
d) ferulic acid into vanillin and/or vanillic acid, and

e) vanillin into vanillic acid. 

Materials and Methods

Conditions for growing the bacteria.

Strains of Escherichia coli were propagated at 37°C in Luria-Bertani (LB) or M9
mineral medium (J. Sambrook, E. F. Fritsch and T. Maniatis. 1989. Molecular
cloning: a laboratory manual. 2nd Edition, Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, New York). Strains of Pseudomonas sp. were propagated at
30°C in Nutrient Broth (NB, 0.8%, wt/vol) or in mineral medium (MM) (H. G.
Schlegel et al. 1961. Arch. Mikrobiol. 38:209-222) or HR mineral medium
(HR-MM) (J. Rabenhorst. 1996. Appl. Microbiol. Biotechnol. 46:470-474). Ferulic acid,
vanillin, vanillic acid and protocatechuic acid were dissolved in dimethyl sulphoxide
and added to the respective medium to give a final concentration of 0.1% (wt/vol).
Eugenol was either added directly to the medium to give a final concentration of
0.1% (vol/vol) or applied to filter paper (circular filter 595, Schleicher & Schuell,
Dassel, Germany) in the lids of MM agar plates. When transconjugants and mutants
of Pseudomonas sp. were being propagated, tetracycline, kanamycin and gentamycin
were employed in final concentrations of 25 µg/ml, 100 µg/ml and 7.5 µg/ml,
respectively.

Qualitative and quantitative detection of metabolic intermediates in culture 
supernatants. 

Culture supernatants were analysed by high pressure liquid chromatography (Knauer
HPLC) either directly or after dilution with doubly distilled H2O. The
chromatography was carried out on Nucleosil 100 C18 (7 µm, 250 x 4 mm). 0.1%
(vol/vol) formic acid and acetonitrile were used as the solvents. The course of the
gradient employed for eluting the substances was as follows:

00:00 - 06:30 -> 26% acetonitrile
06:30 - 08:00 -> 100% acetonitrile
08:00 - 12:00 -> 100% acetonitrile
12:00 - 13:00 -> 26% acetonitrile
13:00 - 18:00 -> 26% acetonitrile
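
The gradient table can be expressed as a piecewise-linear function of time. The Python sketch below does this under two assumptions which the text does not state explicitly: each table row gives the acetonitrile content reached at the end of the interval, with a linear change in between, and the run starts at 26% acetonitrile. The names `SEGMENTS`, `START_PERCENT` and `acetonitrile_percent` are illustrative.

```python
# (end of interval in minutes, % acetonitrile reached at that time),
# transcribed from the gradient table.
SEGMENTS = [(6.5, 26.0), (8.0, 100.0), (12.0, 100.0), (13.0, 26.0), (18.0, 26.0)]

# Assumed starting composition; the table only gives end-of-interval values.
START_PERCENT = 26.0


def acetonitrile_percent(t_min):
    """Return the acetonitrile content (%) at t_min minutes into the run."""
    if not 0.0 <= t_min <= SEGMENTS[-1][0]:
        raise ValueError("time lies outside the 18 min programme")
    t0, p0 = 0.0, START_PERCENT
    for t1, p1 in SEGMENTS:
        if t_min <= t1:
            # Linear interpolation within the current interval (assumption).
            return p0 + (p1 - p0) * (t_min - t0) / (t1 - t0)
        t0, p0 = t1, p1
    return p0


if __name__ == "__main__":
    for t in (0.0, 6.5, 7.25, 10.0, 12.5, 18.0):
        print(f"{t:5.2f} min: {acetonitrile_percent(t):5.1f}% acetonitrile")
```

Under these assumptions the programme is isocratic at 26% for 6.5 min, ramps to 100% within 1.5 min, holds for 4 min, and returns to 26% for re-equilibration.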

Purification of vanillin dehydrogenase II.

The purification was carried out at 4°C.

Crude extract.

Pseudomonas sp. HR199 cells which had been propagated on eugenol were washed
in 10 mM sodium phosphate buffer, pH 6.0, then resuspended in the same buffer and
disrupted by being passed twice through a French press (Amicon, Silver Spring,
Maryland, USA) at a pressure of 1000 psi. The cell homogenate was subjected to
ultracentrifugation (1 h, 100,000 x g, 4°C), resulting in the soluble fraction of crude
extract being obtained as the supernatant.

Anion exchange chromatography on DEAE Sephacel.

The soluble fraction of the crude extract was dialysed overnight against 10 mM
sodium phosphate buffer, pH 6.0. The dialysate was loaded onto a DEAE-Sephacel
column (2.6 cm x 35 cm, bed volume [BV]: 186 ml) which had been equilibrated
with 10 mM sodium phosphate buffer, pH 6.0, and which had a flow rate of
0.8 ml/min. The column was rinsed with two BV of 10 mM sodium phosphate
buffer, pH 6.0. The vanillin dehydrogenase II (VDH II) was eluted with a linear salt
gradient of from 0 to 400 mM NaCl in 10 mM sodium phosphate buffer, pH 6.0 (750
ml). 10 ml fractions were collected. Fractions having a high VDH II activity were
combined to form the DEAE pool.
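
The column figures given above are mutually consistent, as the following Python sketch shows: a cylinder of 2.6 cm diameter and 35 cm height has a volume of about 186 ml, and a 750 ml gradient collected in 10 ml fractions yields 75 fractions. The function names are illustrative, and the NaCl concentration calculated for a given elution volume assumes an ideal linear gradient without mixing or dead volume, which is a simplification.

```python
import math


def bed_volume_ml(diameter_cm, height_cm):
    """Volume of a cylindrical column bed in ml (1 cm^3 = 1 ml)."""
    return math.pi * (diameter_cm / 2.0) ** 2 * height_cm


def nacl_mm_at_volume(eluted_ml, start_mm=0.0, end_mm=400.0, gradient_ml=750.0):
    """Nominal NaCl concentration (mM) after eluted_ml of an ideal linear gradient."""
    fraction = min(max(eluted_ml / gradient_ml, 0.0), 1.0)
    return start_mm + (end_mm - start_mm) * fraction


if __name__ == "__main__":
    bv = bed_volume_ml(2.6, 35.0)
    print(f"bed volume: {bv:.1f} ml")
    print(f"rinse with two BV at 0.8 ml/min: {2 * bv / 0.8 / 60:.1f} h")
    print(f"number of 10 ml fractions: {750 // 10}")
    print(f"nominal NaCl at 375 ml: {nacl_mm_at_volume(375.0):.0f} mM")
```

Each 10 ml fraction therefore spans a nominal increase of roughly 5.3 mM NaCl.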

Determining the vanillin dehydrogenase activity.

The VDH activity was determined at 30°C using an optical enzymic test. The
reaction mixture, whose volume was 1 ml, contained 0.1 mmol of potassium
phosphate (pH 7.1), 0.125 µmol of vanillin, 0.5 µmol of NAD, 1.2 µmol of pyruvate
(Na salt), lactate dehydrogenase (1 U; from pig heart) and enzyme solution. The
oxidation of vanillin was monitored at λ = 340 nm (ε(vanillin) = 11.6 cm²/µmol). The
enzyme activity was given in units (U), with 1 U corresponding to the quantity of
enzyme which converts 1 µmol of vanillin per minute. The protein concentrations in
the samples were determined using the method of Lowry et al. (O. H. Lowry, N. J.
Rosebrough, A. L. Farr and R. J. Randall. 1951. J. Biol. Chem. 193:265-275).
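
The unit definition above translates into a short calculation based on the Lambert-Beer law. Since 1 cm²/µmol equals 1 ml/(µmol x cm), dividing the absorbance change per minute by ε and by the light path gives µmol per ml per minute, and multiplying by the assay volume gives units. The Python sketch below illustrates this; the function names, the 1 cm light path and all numerical inputs in the usage example are assumptions chosen for illustration and are not measured values from the specification.

```python
def activity_units(delta_a_per_min, epsilon_cm2_per_umol,
                   path_cm=1.0, assay_volume_ml=1.0):
    """Return µmol of substrate converted per minute in the cuvette (= U).

    delta_a_per_min: magnitude of the absorbance change per minute.
    epsilon_cm2_per_umol: extinction coefficient in cm^2/µmol.
    path_cm: light path; 1 cm is an assumption, the text gives no value.
    """
    umol_per_ml_per_min = delta_a_per_min / (epsilon_cm2_per_umol * path_cm)
    return umol_per_ml_per_min * assay_volume_ml


def specific_activity(units, enzyme_volume_ml, protein_mg_per_ml):
    """Return U per mg of protein for the volume of enzyme solution added."""
    return units / (enzyme_volume_ml * protein_mg_per_ml)


if __name__ == "__main__":
    # Hypothetical reading: 0.116 absorbance units per minute at 340 nm.
    units = activity_units(0.116, epsilon_cm2_per_umol=11.6)
    print(f"{units:.4f} U in the 1 ml assay")
    # Hypothetical sample: 0.05 ml of extract containing 2 mg protein/ml.
    print(f"{specific_activity(units, 0.05, 2.0):.3f} U/mg of protein")
```

The same two functions apply to the other optical tests described in this section when the respective extinction coefficient is used.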

Determining the coniferyl alcohol dehydrogenase activity.

The CADH activity was determined at 30°C using an optical enzymic test in
accordance with Jaeger et al. (E. Jaeger, L. Eggeling and H. Sahm. 1981. Current
Microbiology. 6:333-336). The reaction mixture, whose volume was 1 ml, contained
0.2 mmol of tris/HCl (pH 9.0), 0.4 µmol of coniferyl alcohol, 2 µmol of NAD,
0.1 mmol of semicarbazide and enzyme solution. The reduction of NAD was
monitored at λ = 340 nm (ε = 6.3 cm²/µmol). The enzyme activity was given in units
(U), with 1 U corresponding to the quantity of enzyme which converts 1 µmol of
substrate per minute. The protein concentrations in the samples were determined by
the method of Lowry et al. (O. H. Lowry, N. J. Rosebrough, A. L. Farr and R.
J. Randall. 1951. J. Biol. Chem. 193:265-275).

Determining the coniferyl aldehyde dehydrogenase activity.

The CALDH activity was determined at 30°C using an optical enzymic test. The
reaction mixture, whose volume was 1 ml, contained 0.1 mmol of tris/HCl (pH 8.8),
0.08 µmol of coniferyl aldehyde, 2.7 µmol of NAD and enzyme solution. The
oxidation of coniferyl aldehyde to ferulic acid was monitored at λ = 400 nm (ε =
34 cm²/µmol). The enzymic activity was given in units (U), with 1 U corresponding
to the quantity of enzyme which converts 1 µmol of substrate per minute. The protein
concentrations in the samples were determined by the method of Lowry et al. (O. H.
Lowry, N. J. Rosebrough, A. L. Farr and R. J. Randall. 1951. J. Biol. Chem.
193:265-275).

Determining the feruloyl-CoA synthetase (ferulic acid thiokinase) activity.

The FCS activity was determined at 30°C using an optical enzymic test which was a
modification of that of Zenk et al. (Zenk et al. 1980. Anal. Biochem. 101:182-187).
The reaction mixture, whose volume was 1 ml, contained 0.09 mmol of potassium
phosphate (pH 7.0), 2.1 µmol of MgCl2, 0.7 µmol of ferulic acid, 2 µmol of ATP,
0.4 µmol of coenzyme A and enzyme solution. The formation of the CoA ester from
ferulic acid was monitored at λ = 345 nm (ε = 10 cm²/µmol). The enzymic activity
was given in units (U), with 1 U corresponding to the quantity of enzyme which
converts 1 µmol of substrate per minute. The protein concentrations in the samples
were determined using the method of Lowry et al. (O. H. Lowry, N. J. Rosebrough,
A. L. Farr and R. J. Randall. 1951. J. Biol. Chem. 193:265-275).

Electrophoretic methods.

Protein-containing extracts were fractionated under native conditions in 7.4%
(wt/vol) polyacrylamide gels using the method of Stegemann et al. (Stegemann et al.
1973. Z. Naturforsch. 28c:722-732) and under denaturing conditions in 11.5%
(wt/vol) polyacrylamide gels using the method of Laemmli (Laemmli, U. K. 1970.
Nature (London) 227:680-685). Serva Blue R was used for non-specific protein
staining. For specifically staining the coniferyl alcohol dehydrogenase, coniferyl
aldehyde dehydrogenase and vanillin dehydrogenase, the gels were rebuffered for
20 min in 100 mM KP buffer (pH 7.0) and subsequently incubated at 30°C in the
same buffer, to which 0.08% (wt/vol) NAD, 0.04% (wt/vol) p-nitro blue tetrazolium
chloride, 0.003% (wt/vol) phenazine methosulphate and 1 mM of the respective
substrate had been added, until corresponding colour bands became visible.

Transfer of proteins from polyacrylamide gels to PVDF membranes.

Proteins were transferred from SDS-polyacrylamide gels to PVDF membranes
(Waters-Millipore, Bedford, Mass., USA) using a Semidry Fastblot appliance
(B32/33, Biometra, Göttingen, Germany) in accordance with the manufacturer's
instructions.

Determining N-terminal amino acid sequences.

N-terminal amino acid sequences were determined using a Protein Peptide Sequencer
(Type 477 A, Applied Biosystems, Foster City, USA) and a PTH analyser in
accordance with the manufacturer's instructions.

Isolating and manipulating DNA.

Genomic DNA was isolated using the method of Marmur (J. Marmur. 1961. J. Mol.
Biol. 3:208-218). Other plasmid DNA and/or DNA restriction fragments were
isolated and analysed using standard methods (J. Sambrook, E. F. Fritsch and
T. Maniatis. 1989. Molecular cloning: a laboratory manual. 2nd Edition, Cold Spring
Harbor Laboratory Press, Cold Spring Harbor, New York).

Transferring DNA.

Competent Escherichia coli cells were prepared and transformed using the method of
Hanahan (D. Hanahan. 1983. J. Mol. Biol. 166:557-580). Conjugative plasmid
transfer between plasmid-harbouring Escherichia coli S17-1 strains (donor) and
Pseudomonas sp. strains (recipient) was performed on NB agar plates in accordance
with the method of Friedrich et al. (B. Friedrich et al. 1981. J. Bacteriol. 147:198-205),
or by means of a "minicomplementation method" on MM agar plates
containing 0.5% (wt/vol) gluconate as the C source and 25 µg of tetracycline/ml or
100 µg of kanamycin/ml. In this case, cells of the recipient were applied in one
direction as an inoculation streak. After 5 min, cells of the donor strains were then
applied as inoculation streaks, with these streaks crossing the recipient inoculation
streak. After incubating at 30°C for 48 h, the transconjugants grew directly
downstream of the crossing site whereas neither the donor strain nor the recipient
strain was able to grow.

Hybridization experiments. 

DNA restriction fragments were fractionated electrophoretically in a 0.8% (wt/vol)
agarose gel in 50 mM tris - 50 mM boric acid - 1.25 mM EDTA buffer (pH 8.5) (J.
Sambrook, E. F. Fritsch and T. Maniatis. 1989. Molecular cloning: a laboratory manual.
2nd Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York).
The transfer of the denatured DNA out of the gel onto a positively charged nylon
membrane (pore size: 0.45 µm, Pall Filtrationstechnik, Dreieich, Germany), the
subsequent hybridization with biotinylated or digoxigenin-labelled DNA probes, and
the preparation of these DNA probes, were all performed using standard methods
(J. Sambrook, E. F. Fritsch and T. Maniatis. 1989. Molecular cloning: a laboratory
manual. 2nd Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor,
New York).

DNA sequencing.

Nucleotide sequences were determined "non-radioactively" in accordance with the
dideoxy chain termination method of Sanger et al. (Sanger et al. 1977. Proc. Natl.
Acad. Sci. USA 74:5463-5467) using a "LI-COR DNA Sequencer Model 4000L"
(LI-COR Inc., Biotechnology Division, Lincoln, NE, USA) and using a "thermo
sequenase fluorescent labelled primer cycle sequencing kit with 7-deaza-dGTP"
(Amersham Life Science, Amersham International plc, Little Chalfont,
Buckinghamshire, England), in each case in accordance with the manufacturer's
instructions.

Synthetic oligonucleotides were used to carry out sequencing in accordance with the
"primer-hopping strategy" of Strauss et al. (E. C. Strauss et al. 1986. Anal. Biochem.
154:353-360).

Chemicals, biochemicals and enzymes. 

Restriction enzymes, T4 DNA ligase, lambda DNA and enzymes and substrates for
the optical enzymic tests were obtained from C.F. Boehringer & Söhne (Mannheim,
Germany) or from GIBCO/BRL (Eggenstein, Germany). [γ-³²P]ATP was from
Amersham/Buchler (Braunschweig, Germany). Oligonucleotides were obtained from
MWG-Biotech GmbH (Ebersberg, Germany). Type NA agarose was obtained from
Pharmacia-LKB (Uppsala, Sweden). All other chemicals were from Haarmann &
Reimer (Holzminden, Germany), E. Merck AG (Darmstadt, Germany), Fluka
Chemie (Buchs, Switzerland), Serva Feinbiochemica (Heidelberg, Germany) or
Sigma Chemie (Deisenhofen, Germany).

Examples

Example 1

Constructing omega elements which mediate resistances to kanamycin (ΩKm)
or gentamycin (ΩGm).

For constructing the ΩKm element, the 2099 bp BglI fragment of transposon Tn5
(E. A. Auerswald, G. Ludwig and H. Schaller. 1981. Cold Spring Harb. Symp.
Quant. Biol. 45:107-113; E. Beck, G. Ludwig, E. A. Auerswald, B. Reiss and H.
Schaller. 1982. Gene 19:327-336; P. Mazodier, P. Cossart, E. Giraud and F. Gasser.
1985. Nucleic Acids Res. 13:195-205) was isolated on a preparative scale. The
fragment was shortened down to approx. 990 bp by treating it with Bal 31 nuclease.
This fragment, which now only comprised the kanamycin resistance gene (encoding
an aminoglycoside 3'-O-phosphotransferase), was then ligated to SmaI-cut pSKsym
DNA (pBluescript SK⁻ derivative which contains a symmetrically constructed
multiple cloning site [SalI, HindIII, EcoRI, SmaI, EcoRI, HindIII, SalI]). It was
possible to reisolate the ΩKm element from the resulting plasmid as a SmaI
fragment, an EcoRI fragment, a HindIII fragment or a SalI fragment.

For constructing the ΩGm element, the 983 bp EaeI fragment of the plasmid
pBBR1MCS-5 (M. E. Kovach, P. H. Elzer, D. S. Hill, G. T. Robertson, M. A. Farris,
R. M. Roop and K. M. Peterson. 1995. Gene 166:175-176) was isolated on a
preparative scale and then treated with mung bean nuclease (progressive digestion of
single-stranded DNA molecule ends). This fragment, which now only comprised the
gentamycin resistance gene (encoding a gentamycin-3-acetyltransferase), was then
ligated to SmaI-cleaved pSKsym DNA (see above). It was possible to reisolate the
ΩGm element from the resulting plasmid as a SmaI fragment, an EcoRI fragment, a
HindIII fragment or a SalI fragment.
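
The reason why the omega elements can be recovered with any of four enzymes follows from the symmetry of the pSKsym multiple cloning site: every site to the left of the central SmaI site has a counterpart to the right, so each of these enzymes cuts on both sides of an insert placed in the SmaI site. The Python sketch below illustrates this. The names `MCS`, `is_symmetric` and `excising_enzymes` are illustrative assumptions, and the sketch disregards two practical conditions: the enzyme must not cut within the insert, and the SmaI site must be restored by the blunt-end ligation.

```python
# Order of restriction sites in the pSKsym multiple cloning site,
# as given in the text.
MCS = ["SalI", "HindIII", "EcoRI", "SmaI", "EcoRI", "HindIII", "SalI"]


def is_symmetric(sites):
    """True if the site order reads the same in both directions."""
    return sites == sites[::-1]


def excising_enzymes(sites, insert_site="SmaI"):
    """Enzymes that flank an insert in insert_site on both sides.

    The insertion site itself is listed first, followed by the flanking
    enzymes from the innermost to the outermost pair.
    """
    position = sites.index(insert_site)
    left, right = sites[:position], sites[position + 1:]
    flanking = [enzyme for enzyme in reversed(left) if enzyme in right]
    return [insert_site] + flanking


if __name__ == "__main__":
    print(is_symmetric(MCS))
    print(excising_enzymes(MCS))
```

The result lists SmaI, EcoRI, HindIII and SalI, the four enzymes named in the example.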

Example 2

Cloning the genes from Pseudomonas sp. HR199 (DSM 7063) which were to be
inactivated by inserting Ω elements or by means of deletions.

The fcs, ech, vdh and aat genes were cloned separately proceeding from the E. coli
S17-1 strains DSM 10439 and DSM 10440 and using the plasmids pE207 and pE5-1
(see EP-A 0845532). The given fragments were isolated on a preparative scale from
these plasmids and treated as described below:

For cloning the fcs gene, the 2350 bp SalI/EcoRI fragment from plasmid pE207 and
the 3700 bp EcoRI/SalI fragment from plasmid pE5-1 were cloned together in
pBluescript SK⁻ such that the two fragments were joined together by way of the
EcoRI ends. The 6050 bp SalI fragment was isolated on a preparative scale from the
resulting hybrid plasmid and shortened down to approx. 2480 bp by being treated
with Bal 31 nuclease. PstI linkers were subsequently ligated to the ends of the
fragment and, after digestion with PstI, the fragment was cloned into pBluescript SK⁻
(pSKfcs). After transformation of E. coli XL1-Blue, clones were obtained which
expressed the fcs gene and exhibited an FCS activity of 0.2 U/mg of protein.

For cloning the ech gene, the 3800 bp HindIII/EcoRI fragment from plasmid pE207
was isolated on a preparative scale and shortened down to approx. 1470 bp by
treating it with Bal 31 nuclease. EcoRI linkers were then ligated to the ends of the
fragment and, after digestion with EcoRI, the fragment was cloned into pBluescript
SK⁻ (pSKech).

For cloning the vdh gene, the 2350 bp SalI/EcoRI fragment from plasmid pE207 was
isolated on a preparative scale. After cloning into pBluescript SK⁻, the fragment was
truncated at one end by approx. 1530 bp using an exonuclease III/mung bean
nuclease system. An EcoRI linker was then ligated to the end of the fragment and,
after digestion with EcoRI, the fragment was cloned into pBluescript SK⁻ (pSKvdh).
Following transformation of E. coli XL1-Blue, clones were obtained which expressed
the vdh gene and exhibited a VDH activity of 0.01 U/mg of protein.

For cloning the aat gene, the 3700 bp EcoRI/SalI fragment from plasmid pE5-1 was
isolated on a preparative scale and shortened down to approx. 1590 bp by treating it
with Bal 31 nuclease. EcoRI linkers were then ligated to the ends of the fragment
and, after digestion with EcoRI, the fragment was cloned into pBluescript SK⁻
(pSKaat).
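
The fragment sizes stated in this example can be checked with simple arithmetic: the two fragments joined by way of their EcoRI ends add up to the 6050 bp SalI fragment, and the difference between the starting and final sizes gives the approximate amount of DNA removed by Bal 31 nuclease. The Python sketch below performs this bookkeeping. The variable and function names are illustrative, the final sizes are the approximate values given in the text, and the vdh fragment is omitted because the text gives the truncation rather than the final size.

```python
# Fragments joined by way of their EcoRI ends for cloning the fcs gene (bp).
JOINED_FRAGMENTS_BP = {
    "pE207 SalI/EcoRI": 2350,
    "pE5-1 EcoRI/SalI": 3700,
}

# (size before Bal 31 treatment, approximate size afterwards) in bp.
BAL31_TRIMMING_BP = {
    "fcs": (6050, 2480),
    "ech": (3800, 1470),
    "aat": (3700, 1590),
}


def joined_length_bp(fragments):
    """Length of the fragment obtained by joining the given fragments."""
    return sum(fragments.values())


def removed_bp(before, after):
    """Approximate number of base pairs removed by nuclease treatment."""
    return before - after


if __name__ == "__main__":
    print(f"joined SalI fragment: {joined_length_bp(JOINED_FRAGMENTS_BP)} bp")
    for gene, (before, after) in BAL31_TRIMMING_BP.items():
        print(f"{gene}: approx. {removed_bp(before, after)} bp removed")
```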

Example 3

Inactivating the above-described genes by inserting Ω elements or by deleting
constituent regions of these genes.

Plasmid pSKfcs, which contained the fcs gene, was digested with BssHII, resulting in
a 1290 bp fragment being excised from the fcs gene. Following religation, the
deletion derivative of the fcs gene (fcsΔ, see Figs. 1i and 2i) was obtained in cloned
form in pBluescript SK⁻ (pSKfcsΔ). In addition, after the fragment had been excised,
the omega elements ΩKm and ΩGm were ligated in its place. This resulted in the
Ω-inactivated derivatives of the fcs gene (fcsΩKm, see Figs. 1g and 2g) and
(fcsΩGm, see Figs. 1h and 2h) being obtained in cloned form in pBluescript SK⁻
(pSKfcsΩKm and pSKfcsΩGm). It was not possible to detect any FCS activity in
crude extracts of the resulting E. coli clones, whose hybrid plasmids possessed an fcs
gene which was inactivated by deletion or by Ω element insertion.

Plasmid pSKech, which contained the ech gene, was digested with NruI, resulting in
a 53 bp fragment and a 430 bp fragment being excised from the ech gene. After
religation, the deletion derivative of the ech gene (echΔ, see Figs. 1l and 2l) was
obtained in cloned form in pBluescript SK⁻ (pSKechΔ). In addition, after the
fragments had been excised, the omega elements ΩKm and ΩGm were ligated in
their place. This resulted in the Ω-inactivated derivatives of the ech gene (echΩKm
and echΩGm) being obtained in cloned form in pBluescript SK⁻ (pSKechΩKm and
pSKechΩGm).

Plasmid pSKvdh, which contained the vdh gene, was digested with BssHII, resulting
in a 210 bp fragment being excised from the vdh gene. After religation, the deletion
derivative of the vdh gene (vdhΔ, see Figs. 1o and 2o) was obtained in cloned form in
pBluescript SK⁻ (pSKvdhΔ). In addition, after the fragment had been excised, the
omega elements ΩKm and ΩGm were ligated in its place. This resulted in the
Ω-inactivated derivatives of the vdh gene (vdhΩKm and vdhΩGm) being obtained in
cloned form in pBluescript SK⁻ (pSKvdhΩKm, see Figs. 1m and 2m) and
(pSKvdhΩGm, see Figs. 1n and 2n). It was not possible to detect any VDH activity
in crude extracts of the resulting E. coli clones, whose hybrid plasmids possessed a
vdh gene which was inactivated by deletion or by Ω element insertion.

Plasmid pSKaat, which contained the aat gene, was digested with BssHII, resulting
in a 59 bp fragment being excised from the aat gene. After religation, the deletion
derivative of the aat gene (aatΔ, see Figs. 1r and 2r) was obtained in cloned form in
pBluescript SK⁻ (pSKaatΔ). In addition, after the fragment had been excised, the
omega elements ΩKm and ΩGm were ligated in its place. This resulted in the
Ω-inactivated derivatives of the aat gene (aatΩKm, see Figs. 1p and 2p) and (aatΩGm,
see Figs. 1q and 2q) being obtained in cloned form in pBluescript SK⁻ (pSKaatΩKm
and pSKaatΩGm).

Example 4

Subcloning the Ω element-inactivated genes into the conjugatively transferable
"suicide plasmid" pSUP202.

In order to be able to replace the intact genes in Pseudomonas sp. HR199 with the
Ω element-inactivated genes, there is a need for a vector which can, on the one hand, be
transferred into pseudomonads (conjugatively transferable plasmids) but which, on
the other hand, cannot replicate in these bacteria and is consequently unstable in
pseudomonads ("suicide plasmid"). DNA segments which are transferred into
pseudomonads using such a plasmid system can only be retained if they are
integrated by means of homologous recombination (RecA-dependent recombination)
into the genome of the bacterial cell. In the present case, the "suicide plasmid"
pSUP202 (Simon et al. 1983. In: A. Pühler. Molecular genetics of the bacteria-plant
interaction. Springer Verlag, Berlin, Heidelberg, New York, pp. 98-106) was used.

Following digestion with Pstl, the inactivated genes fcsQKm and fcsQGm were 
isolated from plasmids pSK/cs£2Km and pSK/c\y£2Gm and ligated to RsrI-cleaved 
pSUP202 DNA. The ligation mixtures were transformed into E, coli SI 7-1. Selection 
took place on tetracycline-containing LB medium which also contained kanamycin or 
20 gentamycin, respectively. Kanamycin-resistant transformants whose hybrid plasmid 
(pSUP/csQKm) contained the inactivated gene fcsQKm were obtained. The 
corresponding hybrid plasmid (pSUP/es£2Gm) of the gentamycin-resistant 
transformants contained the inactivated gene fcsQGm. 

25 Following EcoRI digestion, the inactivated genes ech£2Km and echQ&m were 
isolated from plasmids pSKecftQKm and pSKec/i£2Gm and ligated to EcoRI-cleaved 
pSUP202 DNA. The ligation mixtures were transformed into E. coli S 17-1. Selection 
took place on tetracycline-containing LB medium which also contained kanamycin or 
gentamycin, respectively. Kanamycin-resistant transformants whose hybrid plasmid 

30 (pSUPecftQKm) contained the inactivated gene echQKm were obtained. The 
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corresponding hybrid plasmid (pSUPec/i£2Gm) of the gentamycin-resistant 
transformants contained the inactivated gene echQGm. 

Following EcoRI digestion, the inactivated genes vdhΩKm and vdhΩGm were
isolated from plasmids pSKvdhΩKm and pSKvdhΩGm and ligated to EcoRI-cleaved
pSUP202 DNA. The ligation mixtures were transformed into E. coli S17-1. Selection
took place on tetracycline-containing LB medium which also contained kanamycin or
gentamycin, respectively. Kanamycin-resistant transformants whose hybrid plasmid
(pSUPvdhΩKm) contained the inactivated gene vdhΩKm were obtained. The
corresponding hybrid plasmid (pSUPvdhΩGm) of the gentamycin-resistant
transformants contained the inactivated gene vdhΩGm.

Following EcoRI digestion, the inactivated genes aatΩKm and aatΩGm were
isolated from plasmids pSKaatΩKm and pSKaatΩGm and ligated to EcoRI-cleaved
pSUP202 DNA. The ligation mixtures were transformed into E. coli S17-1. Selection
took place on tetracycline-containing LB medium which also contained kanamycin or
gentamycin, respectively. Kanamycin-resistant transformants whose hybrid plasmid
(pSUPaatΩKm) contained the inactivated gene aatΩKm were obtained. The
corresponding hybrid plasmid (pSUPaatΩGm) of the gentamycin-resistant
transformants contained the inactivated gene aatΩGm.

Example 5 

Subcloning the deletion-inactivated genes into the conjugatively transferable
"suicide plasmid" pHE55, which possesses the "sacB selection system".

In order to be able to replace the intact genes in Pseudomonas sp. HR199 with the
deletion-inactivated genes, a vector is needed which possesses the properties
already described for pSUP202. Unlike the Ω element-inactivated genes, the
deletion-inactivated genes carry no antibiotic resistance, so there is no direct way of
selecting for their successful replacement in Pseudomonas sp. HR199, and another
selection system had to be used.

In the "sacB selection system", the replacing, deletion-inactivated gene is cloned in a
plasmid which possesses the sacB gene in addition to an antibiotic resistance gene.
Following the conjugative transfer of this hybrid plasmid into a pseudomonad, the
plasmid is integrated by means of homologous recombination at the site in the genome
at which the intact gene is located (first crossover). This results in a "heterogenotic"
strain which possesses both an intact gene and a deletion-inactivated gene, with these
genes being separated from each other by the pHE55 DNA. These strains exhibit the
resistance which is encoded by the vector and also possess an active sacB gene. The
intention is that the pHE55 DNA, together with the intact gene, should then be
excised from the genomic DNA by means of a second homologous recombination
event (second crossover). This recombination event results in a strain which
possesses only the inactivated gene and which has lost both the pHE55-encoded
antibiotic resistance and the sacB gene.

If strains are streaked on sucrose-containing media, the growth of strains which
express the sacB gene is inhibited, since the gene product converts sucrose into a
polymer which accumulates in the periplasm of the cells. The growth of cells which no
longer carry the sacB gene, because the second recombination event has taken place,
is consequently not inhibited. In order to be able to select phenotypically for the
integration of the deletion-inactivated gene, this gene is not exchanged for an intact
gene; instead, use is made of a strain in which the gene to be replaced is already
"labelled" by the insertion of an Ω element. When successful replacement takes place,
the resulting strain loses the antibiotic resistance which is encoded by the Ω element.
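
The selection logic described above can be summarised in a short sketch. It is an illustration only: the class and function names are hypothetical, and each strain is reduced to two genotype flags (whether the target gene still carries the Ω element, and whether the pHE55 backbone with its resistance gene and sacB is integrated).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strain:
    """Hypothetical two-flag summary of a Pseudomonas sp. HR199 derivative."""
    omega_in_target_gene: bool  # target gene still carries the Ω element
    vector_integrated: bool     # pHE55 backbone (vector resistance + sacB) present


def phenotype(strain: Strain) -> dict:
    """Predict plate phenotypes from the genotype, following the description."""
    return {
        "resistant_to_omega_antibiotic": strain.omega_in_target_gene,
        "resistant_to_vector_antibiotic": strain.vector_integrated,
        # sacB expression inhibits growth on sucrose-containing media
        "grows_on_sucrose": not strain.vector_integrated,
    }


def is_deletion_mutant(strain: Strain) -> bool:
    """True only for the phenotype expected after a successful replacement."""
    p = phenotype(strain)
    return (
        p["grows_on_sucrose"]
        and not p["resistant_to_omega_antibiotic"]
        and not p["resistant_to_vector_antibiotic"]
    )


# Labelled recipient, strain after the first crossover, and desired end product.
RECIPIENT = Strain(omega_in_target_gene=True, vector_integrated=False)
HETEROGENOTE = Strain(omega_in_target_gene=True, vector_integrated=True)
DELETION_MUTANT = Strain(omega_in_target_gene=False, vector_integrated=False)

for name, strain in [("recipient", RECIPIENT),
                     ("heterogenote", HETEROGENOTE),
                     ("deletion mutant", DELETION_MUTANT)]:
    print(name, phenotype(strain), is_deletion_mutant(strain))
```

The sketch makes the reason for the Ω "label" explicit: a strain that has merely lost the vector again grows on sucrose just like the desired mutant, and only the loss of the Ω-encoded resistance tells the two apart.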

Following digestion with PstI, the inactivated gene fcsΔ was isolated from plasmid
pSKfcsΔ and ligated to PstI-cleaved pHE55 DNA. The ligation mixture was
transformed into E. coli S17-1. Selection took place on tetracycline-containing LB
medium. Tetracycline-resistant transformants, whose hybrid plasmid (pHEfcsΔ)
contained the inactivated gene fcsΔ, were obtained.

Following digestion with EcoRI, the inactivated gene echΔ was isolated from
plasmid pSKechΔ and treated with mung bean nuclease (generation of blunt ends).
The fragment was ligated to BamHI-cleaved and mung bean nuclease-treated pHE55
DNA. The ligation mixture was transformed into E. coli S17-1. Selection took place
on tetracycline-containing LB medium. Tetracycline-resistant transformants, whose
hybrid plasmid (pHEechΔ) contained the inactivated gene echΔ, were obtained.

Following digestion with EcoRI, the inactivated gene vdhΔ was isolated from
plasmid pSKvdhΔ and treated with mung bean nuclease. The fragment was ligated to
BamHI-cleaved and mung bean nuclease-treated pHE55 DNA. The ligation mixture
was transformed into E. coli S17-1. Selection took place on tetracycline-containing
LB medium. Tetracycline-resistant transformants, whose hybrid plasmid (pHEvdhΔ)
contained the inactivated gene vdhΔ, were obtained.

Following digestion with EcoRI, the inactivated gene aatΔ was isolated from plasmid
pSKaatΔ and treated with mung bean nuclease. The fragment was ligated to
BamHI-cleaved and mung bean nuclease-treated pHE55 DNA. The ligation mixture was
transformed into E. coli S17-1. Selection took place on tetracycline-containing LB
medium. Tetracycline-resistant transformants, whose hybrid plasmid (pHEaatΔ)
contained the inactivated gene aatΔ, were obtained.

Example 6

Generating mutants of the strain Pseudomonas sp. HR199 in which genes of
eugenol catabolism have been specifically inactivated by inserting an Ω element.

The strain Pseudomonas sp. HR199 was employed as the recipient in conjugation
experiments in which strains of E. coli S17-1 harbouring the hybrid plasmids of
pSUP202 which are listed below were used as donors. The transconjugants were
selected on gluconate-containing mineral medium which contained the antibiotic
corresponding to the Ω element. It was possible to distinguish between
"homogenotic" (replacement of the intact gene with the Ω element insertion-inactivated
gene by means of a double crossover) and "heterogenotic" (integration of
the hybrid plasmid into the genome by means of a single crossover) transconjugants
on the basis of the pSUP202-encoded tetracycline resistance: heterogenotic
transconjugants retain the integrated vector and are therefore tetracycline-resistant,
whereas homogenotic transconjugants have lost it and are tetracycline-sensitive.
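
The distinction drawn above amounts to a simple decision rule, sketched below for illustration. The function name and return strings are hypothetical; the rule assumes, as described, that the Ω-encoded resistance marks the inserted gene and that tetracycline resistance marks the presence of pSUP202 DNA.

```python
def classify_transconjugant(omega_antibiotic_resistant: bool,
                            tetracycline_resistant: bool) -> str:
    """Classify a Pseudomonas sp. HR199 transconjugant from two plate tests.

    omega_antibiotic_resistant: growth on the antibiotic matching the Ω element
    tetracycline_resistant: growth on tetracycline (pSUP202-encoded resistance)
    """
    if not omega_antibiotic_resistant:
        # No Ω element in the genome, so not a transconjugant of interest.
        return "no insertion"
    if tetracycline_resistant:
        # Vector still integrated: single crossover only.
        return "heterogenotic"
    # Ω element present, vector lost: double crossover.
    return "homogenotic"


print(classify_transconjugant(True, False))
print(classify_transconjugant(True, True))
```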

The mutants Pseudomonas sp. HR199 fcsΩKm and Pseudomonas sp. HR199
fcsΩGm were obtained after conjugating Pseudomonas sp. HR199 with E. coli S17-1
(pSUPfcsΩKm) and E. coli S17-1 (pSUPfcsΩGm), respectively. The replacement of
the intact fcs gene with the ΩKm-inactivated or ΩGm-inactivated gene (fcsΩKm and
fcsΩGm, respectively) was verified by means of DNA sequencing.

The mutants Pseudomonas sp. HR199 echΩKm and Pseudomonas sp. HR199
echΩGm were obtained after conjugating Pseudomonas sp. HR199 with E. coli
S17-1 (pSUPechΩKm) and E. coli S17-1 (pSUPechΩGm), respectively. The
replacement of the intact ech gene with the ΩKm-inactivated or ΩGm-inactivated
gene (echΩKm and echΩGm, respectively) was verified by means of DNA
sequencing.

The mutants Pseudomonas sp. HR199 vdhΩKm and Pseudomonas sp. HR199
vdhΩGm were obtained after conjugating Pseudomonas sp. HR199 with E. coli
S17-1 (pSUPvdhΩKm) and E. coli S17-1 (pSUPvdhΩGm), respectively. The
replacement of the intact vdh gene with the ΩKm-inactivated or ΩGm-inactivated
gene (vdhΩKm and vdhΩGm, respectively) was verified by means of DNA
sequencing.

The mutants Pseudomonas sp. HR199 aatΩKm and Pseudomonas sp. HR199
aatΩGm were obtained after conjugating Pseudomonas sp. HR199 with E. coli
S17-1 (pSUPaatΩKm) and E. coli S17-1 (pSUPaatΩGm), respectively. The
replacement of the intact aat gene with the ΩKm-inactivated or ΩGm-inactivated
gene (aatΩKm and aatΩGm, respectively) was verified by means of DNA
sequencing.

The mutant Pseudomonas sp. HR199 fcsΩKmvdhΩGm was obtained after
conjugating Pseudomonas sp. HR199 fcsΩKm with E. coli S17-1 (pSUPvdhΩGm).
The replacement of the intact vdh gene with the ΩGm-inactivated gene (vdhΩGm)
was verified by means of DNA sequencing.

The mutant Pseudomonas sp. HR199 vdhΩKmaatΩGm was obtained after
conjugating Pseudomonas sp. HR199 vdhΩKm with E. coli S17-1 (pSUPaatΩGm).
The replacement of the intact aat gene with the ΩGm-inactivated gene (aatΩGm)
was verified by means of DNA sequencing.

The mutant Pseudomonas sp. HR199 vdhΩKmechΩGm was obtained after
conjugating Pseudomonas sp. HR199 vdhΩKm with E. coli S17-1 (pSUPechΩGm).
The replacement of the intact ech gene with the ΩGm-inactivated gene (echΩGm)
was verified by means of DNA sequencing.

Example 7

Generating mutants of the strain Pseudomonas sp. HR199 in which genes of
eugenol catabolism have been specifically inactivated by deleting a constituent
region.

The strains Pseudomonas sp. HR199 fcsΩKm, Pseudomonas sp. HR199 echΩKm,
Pseudomonas sp. HR199 vdhΩKm and Pseudomonas sp. HR199 aatΩKm were
employed as recipients in conjugation experiments in which strains of E. coli S17-1
harbouring the hybrid plasmids of pHE55 which are listed below were used as
donors. The "heterogenotic" transconjugants were selected on gluconate-containing
mineral medium which contained tetracycline (pHE55-encoded resistance) in addition
to the antibiotic corresponding to the Ω element. After streaking out on
sucrose-containing mineral medium, transconjugants were obtained which had
eliminated the vector DNA by means of a second recombination event (second
crossover). By streaking out on mineral medium which was without antibiotic or
which contained the antibiotic corresponding to the Ω element, it was possible to
identify the mutants in which the Ω element-inactivated gene had been replaced with
the deletion-inactivated gene (no antibiotic resistance).

The mutant Pseudomonas sp. HR199 fcsΔ was obtained after conjugating
Pseudomonas sp. HR199 fcsΩKm with E. coli S17-1 (pHEfcsΔ). The replacement of
the ΩKm-inactivated gene (fcsΩKm) with the deletion-inactivated gene (fcsΔ) was
verified by means of DNA sequencing.

The mutant Pseudomonas sp. HR199 echΔ was obtained after conjugating
Pseudomonas sp. HR199 echΩKm with E. coli S17-1 (pHEechΔ). The replacement
of the ΩKm-inactivated gene (echΩKm) with the deletion-inactivated gene (echΔ)
was verified by means of DNA sequencing.

The mutant Pseudomonas sp. HR199 vdhΔ was obtained after conjugating
Pseudomonas sp. HR199 vdhΩKm with E. coli S17-1 (pHEvdhΔ). The replacement
of the ΩKm-inactivated gene (vdhΩKm) with the deletion-inactivated gene (vdhΔ)
was verified by means of DNA sequencing.

The mutant Pseudomonas sp. HR199 aatΔ was obtained after conjugating
Pseudomonas sp. HR199 aatΩKm with E. coli S17-1 (pHEaatΔ). The replacement
of the ΩKm-inactivated gene (aatΩKm) with the deletion-inactivated gene (aatΔ)
was verified by means of DNA sequencing.

Example 8

Biotransforming eugenol into vanillin using the mutant Pseudomonas sp. HR199
vdhΩKm.

The strain Pseudomonas sp. HR199 vdhΩKm was propagated in 50 ml of HR-MM
containing 6 mM eugenol up to an optical density of approx. OD600nm = 0.6. After
17 h, it was possible to detect 2.9 mM vanillin, 1.4 mM ferulic acid and 0.4 mM
vanillic acid in the culture supernatant.

Example 9

Biotransforming eugenol into ferulic acid using the mutant Pseudomonas sp.
HR199 vdhΩGmaatΩKm.

The strain Pseudomonas sp. HR199 vdhΩGmaatΩKm was propagated in 50 ml of
HR-MM containing 6 mM eugenol up to an optical density of approx. OD600nm =
0.6. After 18 h, it was possible to detect 1.9 mM vanillin, 2.4 mM ferulic acid and
0.6 mM vanillic acid in the culture supernatant.

Example 10

Biotransforming eugenol into coniferyl alcohol using the mutant Pseudomonas
sp. HR199 vdhΩGmaatΩKm.

The strain Pseudomonas sp. HR199 vdhΩGmaatΩKm was propagated in 50 ml of
HR-MM containing 6 mM eugenol up to an optical density of approx. OD600nm =
0.4. After 15 h, it was possible to detect 1.7 mM coniferyl alcohol, 1.4 mM vanillin,
1.4 mM ferulic acid and 0.2 mM vanillic acid in the culture supernatant.
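
For orientation, the concentrations reported in Examples 8 to 10 can be converted into approximate molar conversions relative to the 6 mM eugenol supplied. The sketch below is an illustration under two assumptions that are not stated in the examples: that all of the eugenol was available for conversion, and that each intermediate is formed from eugenol in a 1:1 molar ratio. The function and variable names are hypothetical.

```python
def molar_conversion_percent(substrate_mM: float, products_mM: dict) -> dict:
    """Return each product as a percentage of the substrate supplied (mol/mol)."""
    return {name: 100.0 * conc / substrate_mM for name, conc in products_mM.items()}


EUGENOL_MM = 6.0  # eugenol supplied in each shake-flask example

# Supernatant concentrations (mM) as reported in the examples.
example_8 = {"vanillin": 2.9, "ferulic acid": 1.4, "vanillic acid": 0.4}
example_9 = {"vanillin": 1.9, "ferulic acid": 2.4, "vanillic acid": 0.6}
example_10 = {"coniferyl alcohol": 1.7, "vanillin": 1.4,
              "ferulic acid": 1.4, "vanillic acid": 0.2}

for label, data in [("Example 8", example_8),
                    ("Example 9", example_9),
                    ("Example 10", example_10)]:
    conversions = molar_conversion_percent(EUGENOL_MM, data)
    summary = ", ".join(f"{name}: {value:.1f} %" for name, value in conversions.items())
    print(f"{label}: {summary}")
```

On these assumptions, roughly half of the eugenol supplied in Example 8 is recovered as vanillin.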

Example 11

Fermentatively producing natural vanillin from eugenol in a 10 l fermenter
using mutant Pseudomonas sp. HR199 vdhΩKm.

The production fermenter was inoculated with 100 ml of a 24-hour-old preliminary
culture which had been propagated at 32°C on a shaking incubator (120 rpm) in a
medium which was adjusted to pH 7.0 and which consisted of 12.5 g of glycerol/l,
10 g of yeast extract/l and 0.37 g of acetic acid/l. The fermenter contained 9.9 l of
medium of the following composition: 1.5 g of yeast extract/l, 1.6 g of KH2PO4/l, 0.2
g of NaCl/l, 0.2 g of MgSO4/l. The pH was adjusted to pH 7.0 with sodium
hydroxide solution. After sterilization, 4 g of eugenol were added to the medium. The
temperature was 32°C, the aeration was 3 Nl/min and the stirrer speed was 600 rpm.
The pH was maintained at pH 6.5 with sodium hydroxide solution.

At 4 hours after the inoculation, continuous addition of eugenol was begun such that
255 g of eugenol had been added to the culture when fermentation ended after 65
hours. 40 g of yeast extract were also fed in during the fermentation. At the end of the
fermentation, the concentration of eugenol was 0.2 g/l. The content of vanillin was
2.6 g/l. 3.4 g of ferulic acid/l were also present.

The vanillin which is obtained in this way can be isolated by known physical
methods such as chromatography, distillation and/or extraction and used for
preparing natural flavourings.
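
The fermentation figures of Example 11 can be put on a molar basis with the rough calculation sketched below. Several inputs are assumptions rather than values from the example: a final culture volume of 10 l (9.9 l of medium plus 100 ml of inoculum, ignoring feed volume, sampling and evaporation), a total of 255 g of eugenol supplied, and standard molar masses for eugenol, vanillin and ferulic acid. The function name is hypothetical.

```python
# Standard molar masses in g/mol (eugenol C10H12O2, vanillin C8H8O3,
# ferulic acid C10H10O4).
MOLAR_MASS = {"eugenol": 164.20, "vanillin": 152.15, "ferulic acid": 194.18}


def fermenter_molar_yields(eugenol_fed_g: float,
                           product_g_per_l: dict,
                           volume_l: float) -> dict:
    """Return each product as mol % of the eugenol supplied."""
    eugenol_mol = eugenol_fed_g / MOLAR_MASS["eugenol"]
    yields = {}
    for name, conc in product_g_per_l.items():
        product_mol = conc * volume_l / MOLAR_MASS[name]
        yields[name] = 100.0 * product_mol / eugenol_mol
    return yields


# Assumed inputs: 255 g eugenol in total, 10 l final volume.
yields = fermenter_molar_yields(
    eugenol_fed_g=255.0,
    product_g_per_l={"vanillin": 2.6, "ferulic acid": 3.4},
    volume_l=10.0,
)
for name, value in yields.items():
    print(f"{name}: {value:.1f} mol % of eugenol supplied")
```

Changing the assumed volume or the total eugenol figure shifts the result proportionally, so the percentages should be read as estimates only.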

Explanatory notes regarding the figures:

FIG. 1a to 1r:

Gene structures for isolating organisms and mutants

calA*: Part of the inactivated gene for coniferyl alcohol dehydrogenase
calB*: Part of the inactivated gene for coniferyl aldehyde dehydrogenase
fcs*: Part of the inactivated gene for feruloyl-CoA synthetase
ech*: Part of the inactivated gene for enoyl-CoA hydratase-aldolase
vdh*: Part of the inactivated gene for vanillin dehydrogenase
aat*: Part of the inactivated gene for beta-ketothiolase

While the restriction enzyme cleavage sites labelled "*" were used for the 
construction, they are no longer functional in the resulting construct. 

FIG. 2a: Nucleotide sequence of the calAΩKm gene structure
FIG. 2b: Nucleotide sequence of the calAΩGm gene structure
FIG. 2c: Nucleotide sequence of the calAΔ gene structure
FIG. 2d: Nucleotide sequence of the calBΩKm gene structure
FIG. 2e: Nucleotide sequence of the calBΩGm gene structure
FIG. 2f: Nucleotide sequence of the calBΔ gene structure
FIG. 2g: Nucleotide sequence of the fcsΩKm gene structure
FIG. 2h: Nucleotide sequence of the fcsΩGm gene structure
FIG. 2i: Nucleotide sequence of the fcsΔ gene structure
FIG. 2j: Nucleotide sequence of the echΩKm gene structure
FIG. 2k: Nucleotide sequence of the echΩGm gene structure
FIG. 2l: Nucleotide sequence of the echΔ gene structure
FIG. 2m: Nucleotide sequence of the vdhΩKm gene structure
FIG. 2n: Nucleotide sequence of the vdhΩGm gene structure
FIG. 2o: Nucleotide sequence of the vdhΔ gene structure
FIG. 2p: Nucleotide sequence of the aatΩKm gene structure
FIG. 2q: Nucleotide sequence of the aatΩGm gene structure
FIG. 2r: Nucleotide sequence of the aatΔ gene structure

Patent claims

1. Transformed and/or mutagenized unicellular or multicellular organism which
is characterized in that enzymes of eugenol and/or ferulic acid catabolism are
inactivated such that the intermediates coniferyl alcohol, coniferyl aldehyde,
ferulic acid, vanillin and/or vanillic acid accumulate.

2. Organism according to Claim 1, characterized in that eugenol and/or ferulic
acid catabolism is altered by inserting Ω elements, or introducing deletions,
into corresponding genes.

3. Organism according to either Claim 1 or 2, characterized in that one or more
genes encoding the enzymes coniferyl alcohol dehydrogenases, coniferyl
aldehyde dehydrogenases, feruloyl-CoA synthetases, enoyl-CoA
hydratase-aldolases, beta-ketothiolases, vanillin dehydrogenases or vanillic acid
demethylases is/are altered and/or inactivated.

4. Organism according to one of Claims 1 to 3, characterized in that it is
unicellular, preferably a microorganism or a plant or animal cell.

5. Organism according to one of Claims 1 to 4, characterized in that it is a
bacterium, preferably a Pseudomonas species.

6. Gene structures in which the nucleotide sequences encoding the enzymes
coniferyl alcohol dehydrogenases, coniferyl aldehyde dehydrogenases,
feruloyl-CoA synthetases, enoyl-CoA hydratase-aldolases, beta-ketothiolases,
vanillin dehydrogenases or vanillic acid demethylases, or two or more of
these enzymes, are altered and/or inactivated.

7. Gene structures having the sequences given in Figures 1a to 1r.

8. Gene structures having the sequences given in Figures 2a to 2r.

9. Vectors which contain at least one gene structure according to one of Claims
6 to 8.

10. Transformed organism according to one of Claims 1 to 5, characterized in that
it harbours at least one vector according to Claim 9.

11. Organism according to one of Claims 1 to 5, characterized in that it contains
at least one gene structure according to one of Claims 6 to 8 which is
integrated into the genome instead of the respective intact gene.

12. Process for the biotechnological preparation of organic compounds, in
particular alcohols, aldehydes and organic acids, characterized in that an
organism according to one of Claims 1 to 5 or 10 to 11 is employed.

13. Process for preparing the organisms according to one of Claims 1 to 5,
characterized in that the alteration in eugenol and/or ferulic acid catabolism is
achieved by means of microbiological culturing methods which are known
per se.

14. Process for preparing an organism according to one of Claims 1 to 5 or 10 to
11, characterized in that the alteration in eugenol and/or ferulic acid
catabolism, and/or the inactivation of the corresponding genes, is achieved by
means of recombinant DNA methods.

15. Use of the organisms according to one of Claims 1 to 5 or 10 to 11 for
preparing coniferyl alcohol, coniferyl aldehyde, ferulic acid, vanillin and/or
vanillic acid.

16. Use of gene structures according to one of Claims 6 to 8 or of a vector
according to Claim 9 for preparing transformed and/or mutagenized
organisms.



Sequences 
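
The listings that follow pair codon triplets with three-letter amino acid translations. As an aid to checking such annotations, the sketch below translates space-separated codons using the standard genetic code. It is an illustration only: the function name is hypothetical, and the use of the standard code for these sequences is an assumption.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
_BASES = "TCAG"
_ONE_LETTER = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
_CODON_TABLE = {"".join(c): aa
                for c, aa in zip(product(_BASES, repeat=3), _ONE_LETTER)}

_THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}


def translate_codons(codon_line: str) -> str:
    """Translate space-separated DNA codons into three-letter amino acids."""
    residues = []
    for codon in codon_line.upper().split():
        if len(codon) != 3 or codon not in _CODON_TABLE:
            raise ValueError(f"not a valid codon: {codon!r}")
        residues.append(_THREE_LETTER[_CODON_TABLE[codon]])
    return " ".join(residues)


# Codons taken from the start of the first open reading frame in the listing.
print(translate_codons(
    "ATG CAA CTG ACC AAC AAG AAA ATC GTC GTC ACC GGA GTG TCC TCC"))
```

Only the codon triplets of a line should be passed in; untranslated flanking sequence and the trailing position number must be removed first.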

CTGCAGCCAG GGCTGAAAAG GAGGGATTCA GTGAGGTCAT GAAGGGAGGG GACGGCGCCT 60

GGCTCCAATT GCTCGATGGC GCCGCGATTG AGTGTCTTGG GCGCGGTCTT GGAGAGTTCG 120

GCTAGGGAGA TAAATTTGCT GGCCATGGTG GCGGCCCCTG ATGGGTTGGA TGATTTTCTG 180 

CATTCTGCAT CATGAAATTC ATGAAATCAT CACTTTTCGG GGGGTGGGTG CACGGGATTG 240 

AAGGTTGCTA GGAGAGTGCA TTGCTCGTAA GCCCAGGAAG CACGCGGGTT TCAGGATGGT 300 

GCATGGAAAT GGCATGAGCT TTGCTGGATA TGATTAGAGA CATTAACTAT TTTGGCGGAA 360

TGGAAGCACG ATTCCTCGCC CGGTAGAGCG GTAACCGCGA CATTCAGGAC CGTAAAAAGG 420

AAAGAGCATG CAA CTG ACC AAC AAG AAA ATC GTC GTC ACC GGA GTG TCC TCC 472
Met Gln Leu Thr Asn Lys Lys Ile Val Val Thr Gly Val Ser Ser
1 5 10 15

GGT ATC GGT GCC GAA ACT GCC CGC GTT CTG CGC TCT CAC GGC GCC ACA 520 
Gly Ile Gly Ala Glu Thr Ala Arg Val Leu Arg Ser His Gly Ala Thr
20 25 30 

GTG ATT GGC GTA GAT CGC AAC ATG CCG AGC CTG ACT CTG GAT GCT TTC 568 
Val Ile Gly Val Asp Arg Asn Met Pro Ser Leu Thr Leu Asp Ala Phe
35 40 45 

GTT CAG GCT GAC CTG AGC CAT CCT GAA GGC ATC GAT AAG GCC ATC GGG 616 
Val Gln Ala Asp Leu Ser His Pro Glu Gly Ile Asp Lys Ala Ile
50 55 60 62 

ACAGCAAGCG AACCGGAATT GCCAGCTGGG GCGCCCTCTG GTAAGGTTGG GAAGCCCTGC 676 

AAAGTAAACT GGATGGCTTT CTTGCCGCCA AGGATCTGAT GGCGCAGGGG ATCAAGATCT 736

GATCAAGAGA CAGGATGAGG ATCGTTTCGC ATG ATT GAA CAA GAT GGA TTG CAC 790
Met Ile Glu Gln Asp Gly Leu His
1 5 

GCA GGT TCT CCG GCC GCT TGG GTG GAG AGG CTA TTC GGC TAT GAC TGG 838 
Ala Gly Ser Pro Ala Ala Trp Val Glu Arg Leu Phe Gly Tyr Asp Trp 
10 15 20 

GCA CAA CAG ACA ATC GGC TGC TCT GAT GCC GCC GTG TTC CGG CTG TCA 886 
Ala Gln Gln Thr Ile Gly Cys Ser Asp Ala Ala Val Phe Arg Leu Ser
25 30 35 40 

GCG CAG GGG CGC CCG GTT CTT TTT GTC AAG ACC GAC CTG TCC GGT GCC 934 

Ala Gln Gly Arg Pro Val Leu Phe Val Lys Thr Asp Leu Ser Gly Ala
45 50 55 

CTG AAT GAA CTG CAG GAC GAG GCA GCG CGG CTA TCG TGG CTG GCC ACG 982 
Leu Asn Glu Leu Gln Asp Glu Ala Ala Arg Leu Ser Trp Leu Ala Thr
60 65 70 






ACG GGC GTT CCT TGC GCA GCT GTG CTC GAC GTT GTC ACT GAA GCG GGA 1030 
Thr Gly Val Pro Cys Ala Ala Val Leu Asp Val Val Thr Glu Ala Gly 
75 80 85 

AGG GAC TGG CTG CTA TTG GGC GAA GTG CCG GGG CAG GAT CTC CTG TCA 1078
Arg Asp Trp Leu Leu Leu Gly Glu Val Pro Gly Gln Asp Leu Leu Ser
90 95 100 

TCT CAC CTT GCT CCT GCC GAG AAA GTA TCC ATC ATG GCT GAT GCA ATG 1126 
Ser His Leu Ala Pro Ala Glu Lys Val Ser Ile Met Ala Asp Ala Met
105 110 115 120 

CGG CGG CTG CAT ACG CTT GAT CCG GCT ACC TGC CCA TTC GAC CAC CAA 1174 
Arg Arg Leu His Thr Leu Asp Pro Ala Thr Cys Pro Phe Asp His Gln
125 130 135 

GCG AAA CAT CGC ATC GAG CGA GCA CGT ACT CGG ATG GAA GCC GGT CTT 1222 
Ala Lys His Arg Ile Glu Arg Ala Arg Thr Arg Met Glu Ala Gly Leu
140 145 150 

GTC GAT CAG GAT GAT CTG GAC GAA GAG CAT CAG GGG CTC GCG CCA GCC 1270
Val Asp Gln Asp Asp Leu Asp Glu Glu His Gln Gly Leu Ala Pro Ala
155 160 165 

GAA CTG TTC GCC AGG CTC AAG GCG CGC ATG CCC GAC GGC GAG GAT CTC 1318 
Glu Leu Phe Ala Arg Leu Lys Ala Arg Met Pro Asp Gly Glu Asp Leu 
170 175 180 

GTC GTG ACC CAT GGC GAT GCC TGC TTG CCG AAT ATC ATG GTG GAA AAT 1366
Val Val Thr His Gly Asp Ala Cys Leu Pro Asn Ile Met Val Glu Asn
185 190 195 200 

GGC CGC TTT TCT GGA TTC ATC GAC TGT GGC CGG CTG GGT GTG GCG GAC 1414 
Gly Arg Phe Ser Gly Phe Ile Asp Cys Gly Arg Leu Gly Val Ala Asp
205 210 215 

CGC TAT CAG GAC ATA GCG TTG GCT ACC CGT GAT ATT GCT GAA GAG CTT 1462 
Arg Tyr Gln Asp Ile Ala Leu Ala Thr Arg Asp Ile Ala Glu Glu Leu
220 225 230 

GGC GGC GAA TGG GCT GAC CGC TTC CTC GTG CTT TAC GGT ATC GCC GCT 1510 
Gly Gly Glu Trp Ala Asp Arg Phe Leu Val Leu Tyr Gly Ile Ala Ala
235 240 245 

CCC GAT TCG CAG CGC ATC GCC TTC TAT CGC CTT CTT GAC GAG TTC TTC 1558 
Pro Asp Ser Gln Arg Ile Ala Phe Tyr Arg Leu Leu Asp Glu Phe Phe
250 255 260 264 

TGAGCGGGAC TCTGGGGTTC GAAATGACCG ACCAAGCGAC GCCCTG GCC *GCG GTG 1613

Ala Ala Val 
225 

ATT GCA TTC ATG TGT GCT GAG GAG TCA CGT TGG ATC AAC GGC ATA AAT 1661 
Ile Ala Phe Met Cys Ala Glu Glu Ser Arg Trp Ile Asn Gly Ile Asn
230 235 240

ATT CCA GTG GAC GGA GGT TTG GCA TCG ACC TAC GTG TAA GTTCGTGGAC 1710
Ile Pro Val Asp Gly Gly Leu Ala Ser Thr Tyr Val
245 250 255 

GCCCTTTGCA CGCGCACTAT ATCTCTATGC AGCAGCTGAA AGCAGCTTTG GTTTTGATCG 1770 

GAGGTAGCGG GCGGAAAGGT GCAGAATGTC TAAATAATAA AGGATTCTTG TGAAGCTTTA 1830 

GTTGTCCGTA AACGAAAATA AAAATAAAGA GGAATGATAT GAAAGCAAGT AGATCAGTCT 1890 

GCACTTTCAA AATAGCTACC CTGGCAGGCG CCATTTATGC AGCGCTGCCA ATGTCAGCTG 1950 

CAAACTCGAT GCAGCTGGAT GTAGGTAGCT CGGATTGGAC GGTGCGTTGG GGACAACACC 2010

CTCAAGTATA GCCTTGCCTC TCGCCTGAAT GAGCAAGACT CAAGTCTGAC AAATGCGCCG 2070 

ACTGTCAATG GTTATATCCG GATATTCAAA GTCAGGGTGA TCGTAACTTT GACCGGGGGC 2130

TTGGTATCCA ATCGTCTCGA TATTCTGGCT GCAG 2164 

FIG. 2a:

CTGCAGCCAG GGCTGAAAAG GAGGGATTCA GTGAGGTCAT GAAGGGAGGG GACGGCGCCT 60

GGCTCCAATT GCTCGATGGC GCCGCGATTG AGTGTCTTGG GCGCGGTCTT GGAGAGTTCG 120 

GCTAGGGAGA TAAATTTGCT GGCCATGGTG GCGGCCCCTG ATGGGTTGGA TGATTTTCTG 180 

CATTCTGCAT CATGAAATTC ATGAAATCAT CACTTTTCGG GGGGTGGGTG CACGGGATTG 240

AAGGTTGCTA GGAGAGTGCA TTGCTCGTAA GCCCAGGAAG CACGCGGGTT TCAGGATGGT 300 

GCATGGAAAT GGCATGAGCT TTGCTGGATA TGATTAGAGA CATTAACTAT TTTGGCGGAA 360 

TGGAAGCACG ATTCCTCGCC CGGTAGAGCG GTAACCGCGA CATTCAGGAC CGTAAAAAGG 420 

AAAGAGCATG CAA CTG ACC AAC AAG AAA ATC GTC GTC ACC GGA GTG TCC TCC 472 
Met Gln Leu Thr Asn Lys Lys Ile Val Val Thr Gly Val Ser Ser
1 5 10 15

GGT ATC GGT GCC GAA ACT GCC CGC GTT CTG CGC TCT CAC GGC GCC ACA 520 
Gly Ile Gly Ala Glu Thr Ala Arg Val Leu Arg Ser His Gly Ala Thr
20 25 30 

GTG ATT GGC GTA GAT CGC AAC ATG CCG AGC CTG ACT CTG GAT GCT TTC 568 
Val Ile Gly Val Asp Arg Asn Met Pro Ser Leu Thr Leu Asp Ala Phe
35 40 45 

GTT CAG GCT GAC CTG AGC CAT CCT GAGGGGAGAG GCGGTTTGCG TATTGGGCGC 622 
Val Gln Ala Asp Leu Ser His Pro
50 55 

ATGCATAAAA ACTGTTGTAA TTCATTAAGC ATTCTGCCGA CATGGAAGCC ATCACAAACG 682

GCATGATGAA CCTGAATCGC CAGCGGCATC AGCACCTTGT CGCCTTGCGT ATAATATTTG 742 

CCCATGGACG CACACCGTGG AAACGGATGA AGGCACGAAC CCAGTTGACA TAAGCCTGTT 802

CGGTTCGTAA ACTGTAATGC AAGTAGCGTA TGCGCTCACG CAACTGGTCC AGAACCTTGA 862

CCGAACGCAG CGGTGGTAAC GGCGCAGTGG CGGTTTTCAT GGCTTGTTAT GACTGTTTTT 922 

TTGTACAGTC TATGCCTCGG GCATCCAAGC AGCAAGCGCG TTACGCCGTG GGTCGATGTT 982 

TGATGTTATG GAGCAGCAAC G ATG TTA CGC AGC AGC AAC GAT GTT ACG CAG 1033 

Met Leu Arg Ser Ser Asn Asp Val Thr Gln
1 5 10

CAG GGC AGT CGC CCT AAA ACA AAG TTA GGT GGC TCA AGT ATG GGC ATC 1081 
Gln Gly Ser Arg Pro Lys Thr Lys Leu Gly Gly Ser Ser Met Gly Ile
15 20 25 

ATT CGC ACA TGT AGG CTC GGC CCT GAC CAA GTC AAA TCC ATG CGG GCT 1129 
Ile Arg Thr Cys Arg Leu Gly Pro Asp Gln Val Lys Ser Met Arg Ala
30 35 40 

GCT CTT GAT CTT TTC GGT CGT GAG TTC GGA GAC GTA GCC ACC TAC TCC 1177 
Ala Leu Asp Leu Phe Gly Arg Glu Phe Gly Asp Val Ala Thr Tyr Ser 
45 50 55 






CAA CAT CAG CCG GAC TCC GAT TAC CTC GGG AAC TTG CTC CGT AGT AAG 1225 
Gln His Gln Pro Asp Ser Asp Tyr Leu Gly Asn Leu Leu Arg Ser Lys
60 65 70 

ACA TTC ATC GCG CTT GCT GCC TTC GAC CAA GAA GCG GTT GTT GGC GCT 1273 
Thr Phe Ile Ala Leu Ala Ala Phe Asp Gln Glu Ala Val Val Gly Ala
75 80 85 90 

CTC GCG GCT TAC GTT CTG CCC AGG TTT GAG CAG CCG CGT AGT GAG ATC 1321 
Leu Ala Ala Tyr Val Leu Pro Arg Phe Glu Gln Pro Arg Ser Glu Ile
95 100 105 

TAT ATC TAT GAT CTC GCA GTC TCC GGC GAG CAC CGG AGG CAG GGC ATT 1369 
Tyr Ile Tyr Asp Leu Ala Val Ser Gly Glu His Arg Arg Gln Gly Ile
110 115 120 

GCC ACC GCG CTC ATC AAT CTC CTC AAG CAT GAG GCC AAC GCG CTT GGT 1417 
Ala Thr Ala Leu Ile Asn Leu Leu Lys His Glu Ala Asn Ala Leu Gly
125 130 135 

GCT TAT GTG ATC TAC GTG CAA GCA GAT TAC GGT GAC GAT CCC GCA GTG 1465 
Ala Tyr Val Ile Tyr Val Gln Ala Asp Tyr Gly Asp Asp Pro Ala Val
140 145 150 

GCT CTC TAT ACA AAG TTG GGC ATA CGG GAA GAA GTG ATG CAC TTT GAT 1513 
Ala Leu Tyr Thr Lys Leu Gly Ile Arg Glu Glu Val Met His Phe Asp
155 160 165 170 

ATC GAC CCA AGT ACC GCC ACC TAA CAATTCGTTC AAGCCGAGAT CGGCTTCCCT 1567
Ile Asp Pro Ser Thr Ala Thr
175 177 

G ATT GCA TTC ATG TGT GCT GAG GAG TCA CGT TGG ATC AAC GGC ATA AAT 1616 
Ile Ala Phe Met Cys Ala Glu Glu Ser Arg Trp Ile Asn Gly Ile Asn
228 230 235 240 

ATT CCA GTG GAC GGA GGT TTG GCA TCG ACC TAC GTG TAA GTTCGTGGAC 1665 
Ile Pro Val Asp Gly Gly Leu Ala Ser Thr Tyr Val
245 250 255 

GCCCTTTGCA CGCGCACTAT ATCTCTATGC AGCAGCTGAA AGCAGCTTTG GTTTTGATCG 1725

GAGGTAGCGG GCGGAAAGGT GCAGAATGTC TAAATAATAA AGGATTCTTG TGAAGCTTTA 1785 

GTTGTCCGTA AACGAAAATA AAAATAAAGA GGAATGATAT GAAAGCAAGT AGATCAGTCT 1845 

GCACTTTCAA AATAGCTACC CTGGCAGGCG CCATTTATGC AGCGCTGCCA ATGTCAGCTG 1905 

CAAACTCGAT GCAGCTGGAT GTAGGTAGCT CGGATTGGAC GGTGCGTTGG GGACAACACC 1965

CTCAAGTATA GCCTTGCCTC TCGCCTGAAT GAGCAAGACT CAAGTCTGAC AAATGCGCCG 2025

ACTGTCAATG GTTATATCCG GATATTCAAA GTCAGGGTGA TCGTAACTTT GACCGGGGGC 2085 



TTGGTATCCA ATCGTCTCGA TATTCTGGCT GCAG 2119

FIG. 2b:






CTGCAGCCAG GGCTGAAAAG GAGGGATTCA GTGAGGTCAT GAAGGGAGGG GACGGCGCCT 60 

GGCTCCAATT GCTCGATGGC GCCGCGATTG AGTGTCTTGG GCGCGGTCTT GGAGAGTTCG 120 

GCTAGGGAGA TAAATTTGCT GGCCATGGTG GCGGCCCCTG ATGGGTTGGA TGATTTTCTG 180 

CATTCTGCAT CATGAAATTC ATGAAATCAT CACTTTTCGG GGGGTGGGTG CACGGGATTG 240 

AAGGTTGCTA GGAGAGTGCA TTGCTCGTAA GCCCAGGAAG CACGCGGGTT TCAGGATGGT 300

GCATGGAAAT GGCATGAGCT TTGCTGGATA TGATTAGAGA CATTAACTAT TTTGGCGGAA 360 

TGGAAGCACG ATTCCTCGCC CGGTAGAGCG GTAACCGCGA CATTCAGGAC CGTAAAAAGG 420 

AAAGAGCATG CAA CTG ACC AAC AAG AAA ATC GTC GTC ACC GGA GTG TCC TCC 472 
Met Gln Leu Thr Asn Lys Lys Ile Val Val Thr Gly Val Ser Ser
1 5 10 15

GGT ATC GGT GCC GAA ACT GCC CGC GTT CTG CGC TCT CAC GGC GCC ACA 520
Gly Ile Gly Ala Glu Thr Ala Arg Val Leu Arg Ser His Gly Ala Thr
20 25 30 

GTG ATT GGC GTA GAT CGC AAC ATG CCG AGC CTG ACT CTG GAT GCT TTC 568 
Val Ile Gly Val Asp Arg Asn Met Pro Ser Leu Thr Leu Asp Ala Phe
35 40 45 

GTT CAG GCT GAC CTG AGC CAT CCT GAA GGC ATC GATC AAC GGC ATA AAT 617 
Val Gln Ala Asp Leu Ser His Pro Glu Gly Ile Asn Gly Ile Asn

50 55 58 240 

ATT CCA GTG GAC GGA GGT TTG GCA TCG ACC TAC GTG TAA GTTCGTGGAC 666 
Ile Pro Val Asp Gly Gly Leu Ala Ser Thr Tyr Val
245 250 255 

GCCCTTTGCA CGCGCACTAT ATCTCTATGC AGCAGCTGAA AGCAGCTTTG GTTTTGATCG 726

GAGGTAGCGG GCGGAAAGGT GCAGAATGTC TAAATAATAA AGGATTCTTG TGAAGCTTTA 786 

GTTGTCCGTA AACGAAAATA AAAATAAAGA GGAATGATAT GAAAGCAAGT AGATCAGTCT 846 

GCACTTTCAA AATAGCTACC CTGGCAGGCG CCATTTATGC AGCGCTGCCA ATGTCAGCTG 906 

CAAACTCGAT GCAGCTGGAT GTAGGTAGCT CGGATTGGAC GGTGCGTTGG GGACAACACC 966

CTCAAGTATA GCCTTGCCTC TCGCCTGAAT GAGCAAGACT CAAGTCTGAC AAATGCGCCG 1026

ACTGTCAATG GTTATATCCG GATATTCAAA GTCAGGGTGA TCGTAACTTT GACCGGGGGC 1086 

TTGGTATCCA ATCGTCTCGA TATTCTGGCT GCAG 1120 

FIG. 2c:

GAATTCCGCG TATCGCCCGG TTCTATCAGC GGGCCGCTTT CGAAAGTCAT GGTGTTAGCC 60

GGTAGGGTCT TTTTCTTGGC CATGCTTGTT GCCTGAACCT TCGTTGACAT AGGGCAGAGG 120

TGCGTTTGCC GCTTCGCTTC GCGATGAACC GCATCGAGAT GCTGAGGTCA GGATTTTTCC 180 

TTAACTCGCG TAAGCATTCT GTCATTTTTT TGGTGGCTTT GAACAGCCTG ATGAAAGGTG 240 

GTCTCGCCCT TTGAGGCCGA TTCTTGGGCG CTTGGCGGCG TCGAAGCGAT GCTCCACTAC 300

CGATTAAGAT AATTAAAATA AGGAAACCGC ATGGTTTCTT ATGTGAATTT GTCTGGCATA 360

CTCCAGCTCA AGGGCAATTT TTGGGCTATT GGCTGAGCAG TTGCCTCTAT ATGGTTATTC 420 

AGAATAACAA TTGACTCCTC AGGAGGTCAG CG ATG AGC ATT CTT GGT TTG AAT 473 

Met Ser Ile Leu Gly Leu Asn
1 5 

GGT GCC CCG GTC GGA GCT GAG CAG CTG GGC TCG GCT CTT GAT CGC ATG 521 
Gly Ala Pro Val Gly Ala Glu Gln Leu Gly Ser Ala Leu Asp Arg Met
10 15 20 

AAG AAG GCG CAC CTG GAG CAG GGG CCT GCA AAC TTG GAG CTG CGT CTG 569 
Lys Lys Ala His Leu Glu Gln Gly Pro Ala Asn Leu Glu Leu Arg Leu
25 30 35 

AGT AGG CTG GAT CGT GCG ATT GCA ATG CTT CTG GAA AAT CGT GAA GCA 617 
Ser Arg Leu Asp Arg Ala Ile Ala Met Leu Leu Glu Asn Arg Glu Ala
40 45 50 55 

ATT GCC GAC GCG GTT TCT GCT GAC TTT GGC AAT CGC AGC CGT GAG CAA 665 
Ile Ala Asp Ala Val Ser Ala Asp Phe Gly Asn Arg Ser Arg Glu Gln
60 65 70 

ACA CTG CTT TGC GAC ATT GCT GGC TCG GTG GCA AGC CTG AAG GAT AGC 713 
Thr Leu Leu Cys Asp Ile Ala Gly Ser Val Ala Ser Leu Lys Asp Ser
75 80 85 

CGC GAG CAC GTG GCC AAA TGG ATG GAG CCC GAA CAT CAC AAG GCG ATG 761

Arg Glu His Val Ala Lys Trp Met Glu Pro Glu His His Lys Ala Met 
90 95 100 

TTT CCA GGG GCG GAG GCA CGC GTT GAG TTT CAG CCG CTG GGT GTC GTT 809 
Phe Pro Gly Ala Glu Ala Arg Val Glu Phe Gln Pro Leu Gly Val Val
105 110 115 

GGG GTC ATT AGT CCC TGG AAC TTC CCT ATC GTA CTG GCC TTT GGG CCG 857 
Gly Val Ile Ser Pro Trp Asn Phe Pro Ile Val Leu Ala Phe Gly Pro
120 125 130 135 

CTG GCC GGC ATA TTC GCA GCA GGT AAT CGC GCC ATG CTC AAG CCG TCC 905 
Leu Ala Gly Ile Phe Ala Ala Gly Asn Arg Ala Met Leu Lys Pro Ser
140 145 150 

GAG CTT ACC CCG CGG ACT TCT GCC CTG CTT GCG GAG CTA ATT GCT CGT 953 
Glu Leu Thr Pro Arg Thr Ser Ala Leu Leu Ala Glu Leu Ile Ala Arg
155 160 165

TAC TTC GAT GAA ACT GAG CTG ACT ACA GTG CTG GGC GAC GCT GAA GTC 1001
Tyr Phe Asp Glu Thr Glu Leu Thr Thr Val Leu Gly Asp Ala Glu Val 
170 175 180 

GGT GCG CTG TTC AGT GCT CAG CCT TTC GAT CAT CTG ATC TTC ACC GGC 1049 
Gly Ala Leu Phe Ser Ala Gln Pro Phe Asp His Leu Ile Phe Thr Gly
185 190 195 

GGC ACT GCC GTG GCC AAG CAC ATC ATG CGT GCC GCG GCG GAT AAC CTA 1097 
Gly Thr Ala Val Ala Lys His Ile Met Arg Ala Ala Ala Asp Asn Leu
200 205 210 215 

GTG CCC GTT ACC CTG GAA TTG GGT GGC AAA TCG CCG GTG ATC GTT TCC 1145 
Val Pro Val Thr Leu Glu Leu Gly Gly Lys Ser Pro Val Ile Val Ser
220 225 230 

CGC AGT GCA GAT ATG GCG GAC GTT GCA CAA CGG GTG TTG ACG GTG AAA 1193 
Arg Ser Ala Asp Met Ala Asp Val Ala Gln Arg Val Leu Thr Val Lys
235 240 245 

ACC TTC AAT GCC GGG CAA ATC TGT CTG GCA CCG GAC TAT GTG CTG CTG 1241 
Thr Phe Asn Ala Gly Gln Ile Cys Leu Ala Pro Asp Tyr Val Leu Leu
250 255 260 

CCG GAA GGGACAGCAA GCGAACCGGA ATTGCCAGCT GGGGCGCCCT CTGGTAAGGT 1297 
Pro Glu 
265 

TGGGAAGCCC TGCAAAGTAA ACTGGATGGC TTTCTTGCCG CCAAGGATCT GATGGCGCAG 1357 

GGGATCAAGA TCTGATCAAG AGACAGGATG AGGATCGTTT CGC ATG ATT GAA CAA 1412
Met Ile Glu Gln
1 

GAT GGA TTG CAC GCA GGT TCT CCG GCC GCT TGG GTG GAG AGG CTA TTC 1460 
Asp Gly Leu His Ala Gly Ser Pro Ala Ala Trp Val Glu Arg Leu Phe 
5 10 15 20 

GGC TAT GAC TGG GCA CAA CAG ACA ATC GGC TGC TCT GAT GCC GCC GTG 1508 
Gly Tyr Asp Trp Ala Gln Gln Thr Ile Gly Cys Ser Asp Ala Ala Val
25 30 35 

TTC CGG CTG TCA GCG CAG GGG CGC CCG GTT CTT TTT GTC AAG ACC GAC 1556 
Phe Arg Leu Ser Ala Gln Gly Arg Pro Val Leu Phe Val Lys Thr Asp
40 45 50 

CTG TCC GGT GCC CTG AAT GAA CTG CAG GAC GAG GCA GCG CGG CTA TCG 1604 
Leu Ser Gly Ala Leu Asn Glu Leu Gln Asp Glu Ala Ala Arg Leu Ser
55 60 65 

TGG CTG GCC ACG ACG GGC GTT CCT TGC GCA GCT GTG CTC GAC GTT GTC 1652 
Trp Leu Ala Thr Thr Gly Val Pro Cys Ala Ala Val Leu Asp Val Val 
70 75 80 
ACT GAA GCG GGA AGG GAC TGG CTG CTA TTG GGC GAA GTG CCG GGG CAG 1700
Thr Glu Ala Gly Arg Asp Trp Leu Leu Leu Gly Glu Val Pro Gly Gln
85 90 95 100 

GAT CTC CTG TCA TCT CAC CTT GCT CCT GCC GAG AAA GTA TCC ATC ATG 1748 
Asp Leu Leu Ser Ser His Leu Ala Pro Ala Glu Lys Val Ser Ile Met
105 110 115 

GCT GAT GCA ATG CGG CGG CTG CAT ACG CTT GAT CCG GCT ACC TGC CCA 1796
Ala Asp Ala Met Arg Arg Leu His Thr Leu Asp Pro Ala Thr Cys Pro
120 125 130 

TTC GAC CAC CAA GCG AAA CAT CGC ATC GAG CGA GCA CGT ACT CGG ATG 1844 
Phe Asp His Gln Ala Lys His Arg Ile Glu Arg Ala Arg Thr Arg Met
135 140 145 

GAA GCC GGT CTT GTC GAT CAG GAT GAT CTG GAC GAA GAG CAT CAG GGG 1892 
Glu Ala Gly Leu Val Asp Gln Asp Asp Leu Asp Glu Glu His Gln Gly
150 155 160 

CTC GCG CCA GCC GAA CTG TTC GCC AGG CTC AAG GCG CGC ATG CCC GAC 1940 
Leu Ala Pro Ala Glu Leu Phe Ala Arg Leu Lys Ala Arg Met Pro Asp 
165 170 175 180 

GGC GAG GAT CTC GTC GTG ACC CAT GGC GAT GCC TGC TTG CCG AAT ATC 1988
Gly Glu Asp Leu Val Val Thr His Gly Asp Ala Cys Leu Pro Asn Ile
185 190 195 

ATG GTG GAA AAT GGC CGC TTT TCT GGA TTC ATC GAC TGT GGC CGG CTG 2036
Met Val Glu Asn Gly Arg Phe Ser Gly Phe Ile Asp Cys Gly Arg Leu
200 205 210 

GGT GTG GCG GAC CGC TAT CAG GAC ATA GCG TTG GCT ACC CGT GAT ATT 2084
Gly Val Ala Asp Arg Tyr Gln Asp Ile Ala Leu Ala Thr Arg Asp Ile
215 220 225 

GCT GAA GAG CTT GGC GGC GAA TGG GCT GAC CGC TTC CTC GTG CTT TAC 2132 
Ala Glu Glu Leu Gly Gly Glu Trp Ala Asp Arg Phe Leu Val Leu Tyr 
230 235 240 

GGT ATC GCC GCT CCC GAT TCG CAG CGC ATC GCC TTC TAT CGC CTT CTT 2180 
Gly Ile Ala Ala Pro Asp Ser Gln Arg Ile Ala Phe Tyr Arg Leu Leu
245 250 255 260 

GAC GAG TTC TTC TGA GCGGGACTCT GGGGTTCGAA ATGACCGACC AAGCGACGCC 2235 
Asp Glu Phe Phe 
264 

CGC CAT GCC AAG CCT GTT CTC GTG CAA AGT CCT GTG GGT GAG TCG AAC 2283 
His Ala Lys Pro Val Leu Val Gln Ser Pro Val Gly Glu Ser Asn
444 445 450 455 

TTG GCG ATG CGC GCA CCC TAC GGA GAA GCG ATC CAC GGA CTG CTC TCT 2331 
Leu Ala Met Arg Ala Pro Tyr Gly Glu Ala Ile His Gly Leu Leu Ser
460 465 470 



GTC CTC CTT TCA ACG GAG TGT TAG AACCGTTGGT AGTGGTTTTG GACGGGCCCA 2385
Val Leu Leu Ser Thr Glu Cys 
475 480 481 

GGAGCATGCG CTTCTGGGCC CGTTTCTTGA GTATTCATTG GATAGTCACG CGTGGTAGCT 2445

TCGAGCCTGC ACAGCTGATG AGCACCCTGG AAGGCGCGCT GTACGCGGAC GACTGGGTTC 2505 

ATCTTCGCCA TTCATGACGG AACTCCGTTC CCCAGTACCG CGATGACTAT TTTGCCTCTT 2565

CCGATGTCCG ATTCCACGCC GCCTGACGCT AAGCGGGGGC GGGGGCGCCC GCATCCCAGC 2625

CCAGACAGCA ACAAATGAGT AGGCTCTTGG ATGCCGCGGC GGCTGAGATT GGTAACGGCA 2685 

ATTTCGTCAA TGTGACGATG GATTCGATTG CCCGTGCTGC CGGCGTCTCA AAAAAAACGC 2745 

TGTACGTCTT GGTGGCGAGC AAGGAAGAAC TCATTTCCCG GTTAGTGGCT CGAGACATGT 2805 

CCAACCTTGA GGAATTC 2822
FIG. 2d: 
GAATTCCGCG TATCGCCCGG TTCTATCAGC GGGCCGCTTT CGAAAGTCAT GGTGTTAGCC 60

GGTAGGGTCT TTTTCTTGGC CATGCTTGTT GCCTGAACCT TCGTTGACAT AGGGCAGAGG 120 

TGCGTTTGCC GCTTCGCTTC GCGATGAACC GCATCGAGAT GCTGAGGTCA GGATTTTTCC 180 

TTAACTCGCG TAAGCATTCT GTCATTTTTT TGGTGGCTTT GAACAGCCTG ATGAAAGGTG 240 

GTCTCGCCCT TTGAGGCCGA TTCTTGGGCG CTTGGCGGCG TCGAAGCGAT GCTCCACTAC 300 

CGATTAAGAT AATTAAAATA AGGAAACCGC ATGGTTTCTT ATGTGAATTT GTCTGGCATA 360

CTCCAGCTCA AGGGCAATTT TTGGGCTATT GGCTGAGCAG TTGCCTCTAT ATGGTTATTC 420

AGAATAACAA TTGACTCCTC AGGAGGTCAG CG ATG AGC ATT CTT GGT TTG AAT 473 

Met Ser Ile Leu Gly Leu Asn

1 5 

GGT GCC CCG GTC GGA GCT GAG CAG CTG GGC TCG GCT CTT GAT CGC ATG 521 

Gly Ala Pro Val Gly Ala Glu Gln Leu Gly Ser Ala Leu Asp Arg Met
10 15 20 

AAG AAG GCG CAC CTG GAG CAG GGG CCT GCA AAC TTG GAG CTG CGT CTG 569
Lys Lys Ala His Leu Glu Gln Gly Pro Ala Asn Leu Glu Leu Arg Leu
25 30 35 

AGT AGG CTG GAT CGT GCG ATT GCA ATG CTT CTG GAA AAT CGT GAA GCA 617 
Ser Arg Leu Asp Arg Ala Ile Ala Met Leu Leu Glu Asn Arg Glu Ala
40 45 50 55 

ATT GCC GAC GCG GTT TCT GCT GAC TTT GGC AAT CGC AGC CGT GAG CAA 665 
Ile Ala Asp Ala Val Ser Ala Asp Phe Gly Asn Arg Ser Arg Glu Gln
60 65 70 

ACA CTG CTT TGC GAC ATT GCT GGC TCG GTG GCA AGC CTG AAG GAT AGC 713 
Thr Leu Leu Cys Asp Ile Ala Gly Ser Val Ala Ser Leu Lys Asp Ser
75 80 85 

CGC GAG CAC GTG GCC AAA TGG ATG GAG CCC GAA CAT CAC AAG GCG ATG 761 
Arg Glu His Val Ala Lys Trp Met Glu Pro Glu His His Lys Ala Met 
90 95 100 

TTT CCA GGG GCG GAG GCA CGC GTT GAG TTT CAG CCG CTG GGT GTC GTT 809 
Phe Pro Gly Ala Glu Ala Arg Val Glu Phe Gln Pro Leu Gly Val Val
105 110 115 

GGG GTC ATT AGT CCC TGG AAC TTC CCT ATC GTA CTG GCC TTT GGG CCG 857 
Gly Val Ile Ser Pro Trp Asn Phe Pro Ile Val Leu Ala Phe Gly Pro
120 125 130 135 

CTG GCC GGC ATA TTC GCA GCA GGT AAT CGC GCC ATG CTC AAG CCG TCC 905
Leu Ala Gly Ile Phe Ala Ala Gly Asn Arg Ala Met Leu Lys Pro Ser
140 145 150 

GAG CTT ACC CCG CGG ACT TCT GCC CTG CTT GCG GAG CTA ATT GCT CGT 953 
Glu Leu Thr Pro Arg Thr Ser Ala Leu Leu Ala Glu Leu Ile Ala Arg
155 160 165 
TAC TTC GAT GAA ACT GAG CTG ACT ACA GTG CTG GGC GAC GCT GAA GTC 1001
Tyr Phe Asp Glu Thr Glu Leu Thr Thr Val Leu Gly Asp Ala Glu Val 
170 175 180 

GGT GCG CTG TTC AGT GCT CAG CCT TTC GAT CAT CTG ATC TTC ACC GGC 1049 
Gly Ala Leu Phe Ser Ala Gln Pro Phe Asp His Leu Ile Phe Thr Gly
185 190 195 

GGC ACT GCC GTG GCC AAG CAC ATC ATG CGT GCC GCG GCG GAT AAC CTA 1097 
Gly Thr Ala Val Ala Lys His Ile Met Arg Ala Ala Ala Asp Asn Leu
200 205 210 215 

GTG CCC GTT ACC CTG GAA TTG GGT GGC AAA TCG CCG GTG ATC GTT TCC 1145 
Val Pro Val Thr Leu Glu Leu Gly Gly Lys Ser Pro Val Ile Val Ser
220 225 230 

CGC AGT GCA GAT ATG GCG GAC GTT GCA CAA CGG GTG TTG ACG GTG AAA 1193 
Arg Ser Ala Asp Met Ala Asp Val Ala Gln Arg Val Leu Thr Val Lys
235 240 245 

ACC TTC AAT GCC GGG CAA ATC TGT CTG GCA CCG GAC TAT GTG CTG GGG 1241 
Thr Phe Asn Ala Gly Gln Ile Cys Leu Ala Pro Asp Tyr Val Leu
250 255 260 262 

GAGAGGCGGT TTGCGTATTG GGCGCATGCA TAAAAACTGT TGTAATTCAT TAAGCATTCT 1301 

GCCGACATGG AAGCCATCAC AAACGGCATG ATGAACCTGA ATCGCCAGCG GCATCAGCAC 1361

CTTGTCGCCT TGCGTATAAT ATTTGCCCAT GGACGCACAC CGTGGAAACG GATGAAGGCA 1421 

CGAACCCAGT TGACATAAGC CTGTTCGGTT CGTAAACTGT AATGCAAGTA GCGTATGCGC 1481

TCACGCAACT GGTCCAGAAC CTTGACCGAA CGCAGCGGTG GTAACGGCGC AGTGGCGGTT 1541

TTCATGGCTT GTTATGACTG TTTTTTTGTA CAGTCTATGC CTCGGGCATC CAAGCAGCAA 1601 

GCGCGTTACG CCGTGGGTCG ATGTTTGATG TTATGGAGCA GCAACG ATG TTA CGC 1656 

Met Leu Arg 
1 

AGC AGC AAC GAT GTT ACG CAG CAG GGC AGT CGC CCT AAA ACA AAG TTA 1704 
Ser Ser Asn Asp Val Thr Gln Gln Gly Ser Arg Pro Lys Thr Lys Leu
5 10 15 

GGT GGC TCA AGT ATG GGC ATC ATT CGC ACA TGT AGG CTC GGC CCT GAC 1752 
Gly Gly Ser Ser Met Gly Ile Ile Arg Thr Cys Arg Leu Gly Pro Asp
20 25 30 35 

CAA GTC AAA TCC ATG CGG GCT GCT CTT GAT CTT TTC GGT CGT GAG TTC 1800 
Gln Val Lys Ser Met Arg Ala Ala Leu Asp Leu Phe Gly Arg Glu Phe
40 45 50 

GGA GAC GTA GCC ACC TAC TCC CAA CAT CAG CCG GAC TCC GAT TAC CTC 1848 
Gly Asp Val Ala Thr Tyr Ser Gln His Gln Pro Asp Ser Asp Tyr Leu
55 60 65 

GGG AAC TTG CTC CGT AGT AAG ACA TTC ATC GCG CTT GCT GCC TTC GAC 1896 
Gly Asn Leu Leu Arg Ser Lys Thr Phe Ile Ala Leu Ala Ala Phe Asp
70 75 80 

CAA GAA GCG GTT GTT GGC GCT CTC GCG GCT TAC GTT CTG CCC AGG TTT 1944
Gln Glu Ala Val Val Gly Ala Leu Ala Ala Tyr Val Leu Pro Arg Phe
85 90 95

GAG CAG CCG CGT AGT GAG ATC TAT ATC TAT GAT CTC GCA GTC TCC GGC 1992
Glu Gln Pro Arg Ser Glu Ile Tyr Ile Tyr Asp Leu Ala Val Ser Gly
100 105 110 115

GAG CAC CGG AGG CAG GGC ATT GCC ACC GCG CTC ATC AAT CTC CTC AAG 2040
Glu His Arg Arg Gln Gly Ile Ala Thr Ala Leu Ile Asn Leu Leu Lys
120 125 130

CAT GAG GCC AAC GCG CTT GGT GCT TAT GTG ATC TAC GTG CAA GCA GAT 2088
His Glu Ala Asn Ala Leu Gly Ala Tyr Val Ile Tyr Val Gln Ala Asp
135 140 145

TAC GGT GAC GAT CCC GCA GTG GCT CTC TAT ACA AAG TTG GGC ATA CGG 2136
Tyr Gly Asp Asp Pro Ala Val Ala Leu Tyr Thr Lys Leu Gly Ile Arg
150 155 160

GAA GAA GTG ATG CAC TTT GAT ATC GAC CCA AGT ACC GCC ACC TAA CAA 2184
Glu Glu Val Met His Phe Asp Ile Asp Pro Ser Thr Ala Thr
165 170 175 177

TTCGTTCAAG CCGAGATCGG CTTCCCTG CAA AGT CCT GTG GGT GAG TCG AAC 2236
Gln Ser Pro Val Gly Glu Ser Asn
451 455

TTG GCG ATG CGC GCA CCC TAC GGA GAA GCG ATC CAC GGA CTG CTC TCT 2284
Leu Ala Met Arg Ala Pro Tyr Gly Glu Ala Ile His Gly Leu Leu Ser
460 465 470

GTC CTC CTT TCA ACG GAG TGT TAG AACCGTTGGT AGTGGTTTTG GACGGGCCCA 2338
Val Leu Leu Ser Thr Glu Cys
475 480 481

GGAGCATGCG CTTCTGGGCC CGTTTCTTGA GTATTCATTG GATAGTCACG CGTGGTAGCT 2398

TCGAGCCTGC ACAGCTGATG AGCACCCTGG AAGGCGCGCT GTACGCGGAC GACTGGGTTC 2458

ATCTTCGCCA TTCATGACGG AACTCCGTTC CCCAGTACCG CGATGACTAT TTTGCCTCTT 2518

CCGATGTCCG ATTCCACGCC GCCTGACGCT AAGCGGGGGC GGGGGCGCCC GCATCCCAGC 2578

CCAGACAGCA ACAAATGAGT AGGCTCTTGG ATGCCGCGGC GGCTGAGATT GGTAACGGCA 2638

ATTTCGTCAA TGTGACGATG GATTCGATTG CCCGTGCTGC CGGCGTCTCA AAAAAAACGC 2698

TGTACGTCTT GGTGGCGAGC AAGGAAGAAC TCATTTCCCG GTTAGTGGCT CGAGACATGT 2758

CCAACCTTGA GGAATTC 2775



FIG. 2e:
GAATTCCGCG TATCGCCCGG TTCTATCAGC GGGCCGCTTT CGAAAGTCAT GGTGTTAGCC 60

GGTAGGGTCT TTTTCTTGGC CATGCTTGTT GCCTGAACCT TCGTTGACAT AGGGCAGAGG 120 

TGCGTTTGCC GCTTCGCTTC GCGATGAACC GCATCGAGAT GCTGAGGTCA GGATTTTTCC 180 

TTAACTCGCG TAAGCATTCT GTCATTTTTT TGGTGGCTTT GAACAGCCTG ATGAAAGGTG 240 

GTCTCGCCCT TTGAGGCCGA TTCTTGGGCG CTTGGCGGCG TCGAAGCGAT GCTCCACTAC 300

CGATTAAGAT AATTAAAATA AGGAAACCGC ATGGTTTCTT ATGTGAATTT GTCTGGCATA 360 

CTCCAGCTCA AGGGCAATTT TTGGGCTATT GGCTGAGCAG TTGCCTCTAT ATGGTTATTC 420

AGAATAACAA TTGACTCCTC AGGAGGTCAG CG ATG AGC ATT CTT GGT TTG AAT 473 

Met Ser Ile Leu Gly Leu Asn

1 5 

GGT GCC CCG GTC GGA GCT GAG CAG CTG GGC TCG GCT CTT GAT CGC ATG 521 

Gly Ala Pro Val Gly Ala Glu Gln Leu Gly Ser Ala Leu Asp Arg Met

10 15 20 

AAG AAG GCG CAC CTG GAG CAG GGG CCT GCA AAC TTG GAG CTG CGT CTG 569 
Lys Lys Ala His Leu Glu Gln Gly Pro Ala Asn Leu Glu Leu Arg Leu
25 30 35 

AGT AGG CTG GAT CGT GCG ATT GCA ATG CTT CTG GAA AAT CGT GAA GCA 617 
Ser Arg Leu Asp Arg Ala Ile Ala Met Leu Leu Glu Asn Arg Glu Ala
40 45 50 55 

ATT GCC GAC GCG GTT TCT GCT GAC TTT GGC AAT CGC AGC CGT GAG CAA 665 
Ile Ala Asp Ala Val Ser Ala Asp Phe Gly Asn Arg Ser Arg Glu Gln
60 65 70 

ACA CTG CTT TGC GAC ATT GCT GGC TCG GTG GCA AGC CTG AAG GAT AGC 713 
Thr Leu Leu Cys Asp Ile Ala Gly Ser Val Ala Ser Leu Lys Asp Ser
75 80 85 

CGC GAG CAC GTG GCC AAA TGG ATG GAG CCC GAA CAT CAC AAG GCG ATG 761 
Arg Glu His Val Ala Lys Trp Met Glu Pro Glu His His Lys Ala Met 
90 95 100 

TTT CCA GGG GCG GAG GCA CGC GTT GAG TTT CAG CCG CTG GGT GTC GTT 809 
Phe Pro Gly Ala Glu Ala Arg Val Glu Phe Gln Pro Leu Gly Val Val
105 110 115 

GGG GTC ATT AGT CCC TGG AAC TTC CCT ATC GTA CTG GCC TTT GGG CCG 857 
Gly Val Ile Ser Pro Trp Asn Phe Pro Ile Val Leu Ala Phe Gly Pro
120 125 130 135

CTG GCC GGC ATA TTC GCA GCA GGT AAT CGC GCC ATG CTC AAG CCG TCC 905 
Leu Ala Gly Ile Phe Ala Ala Gly Asn Arg Ala Met Leu Lys Pro Ser
140 145 150 

GAG CTT ACC CCG CGG ACT TCT GCC CTG CTT GCG GAG CTA ATT GCT CGT 953 
Glu Leu Thr Pro Arg Thr Ser Ala Leu Leu Ala Glu Leu Ile Ala Arg
155 160 165 






TAC TTC GAT GAA ACT GAG CTG ACT ACA GTG CTG GGC GAC GCT GAA GTC 1001 
Tyr Phe Asp Glu Thr Glu Leu Thr Thr Val Leu Gly Asp Ala Glu Val 
170 175 180 

GGT GCG CTG TTC AGT GCT CAG CCT TTC GAT CAT CTG ATC TTC ACC GGC 1049 
Gly Ala Leu Phe Ser Ala Gln Pro Phe Asp His Leu Ile Phe Thr Gly
185 190 195 

GGC ACT GCC GTG GCC AAG CAC ATC ATG CGT GCC GCG GCG GAT AAC CTA 1097 
Gly Thr Ala Val Ala Lys His Ile Met Arg Ala Ala Ala Asp Asn Leu
200 205 210 215 

GTG CCC GTT ACC CTG GAA TTG GGT GGC AAA TCG CCG GTG ATC GTT TCC 1145 
Val Pro Val Thr Leu Glu Leu Gly Gly Lys Ser Pro Val Ile Val Ser
220 225 230 

CGC AGT GCA GAT ATG GCG GAC GTT GCA CAA CGG GTG TTG ACG GTG AAA 1193 
Arg Ser Ala Asp Met Ala Asp Val Ala Gln Arg Val Leu Thr Val Lys
235 240 245 

ACC TTC AAT GCC GGG CAA ATC TGT CTG GCA CC GTG GGT GAG TCG AAC 1240 
Thr Phe Asn Ala Gly Gln Ile Cys Leu Ala Val Gly Glu Ser Asn
250 255 257 454 455 

TTG GCG ATG CGC GCA CCC TAC GGA GAA GCG ATC CAC GGA CTG CTC TCT 1288 
Leu Ala Met Arg Ala Pro Tyr Gly Glu Ala Ile His Gly Leu Leu Ser
460 465 470 

GTC CTC CTT TCA ACG GAG TGT TAG AACCGTTGGT AGTGGTTTTG GACGGGCCCA 1342 
Val Leu Leu Ser Thr Glu Cys 
475 480 481 

GGAGCATGCG CTTCTGGGCC CGTTTCTTGA GTATTCATTG GATAGTCACG CGTGGTAGCT 1402 

TCGAGCCTGC ACAGCTGATG AGCACCCTGG AAGGCGCGCT GTACGCGGAC GACTGGGTTC 1462 

ATCTTCGCCA TTCATGACGG AACTCCGTTC CCCAGTACCG CGATGACTAT TTTGCCTCTT 1522 

CCGATGTCCG ATTCCACGCC GCCTGACGCT AAGCGGGGGC GGGGGCGCCC GCATCCCAGC 1582 

CCAGACAGCA ACAAATGAGT AGGCTCTTGG ATGCCGCGGC GGCTGAGATT GGTAACGGCA 1642 

ATTTCGTCAA TGTGACGATG GATTCGATTG CCCGTGCTGC CGGCGTCTCA AAAAAAACGC 1702 

TGTACGTCTT GGTGGCGAGC AAGGAAGAAC TCATTTCCCG GTTAGTGGCT CGAGACATGT 1762 

CCAACCTTGA GGAATTC 1779 



FIG. 2f: 
CTGCAGCCGA GCATCGATTG AGCACTTTAC CCAGCTGCGC TGGCTGACCA TTCAGAATGG 60

CCCGCGGCAC TATCCAATCT AAATCGATCT TCGGGCGCCG CGGGCATCAT GCCCGCGGCG 120 

CTCGCCTCAT TTCAATCTCT AACTTGATAA AAACAGAGCT GTTCTCCGGT CTTGGTGGAT 180 

CAAGGCCAGT CGCGGAGAGT CTCGAAGAGG AGAGTACAGT GAACGCCGAG TCCACATTGC 240 

AACCGCAGGC ATCATCATGC TCTGCTCAGC CACGCTACCG CAGTGTGTCG ATTGGTCATC 300

CTCCGGTTGA GGTTACGCAA GACGCTGGAG GTATTGTCCG G ATG CGT TCT CTC GAG 356

Met Arg Ser Leu Glu 
1 5 

GCG CTT CTT CCC TTC CCG GGT CGA ATT CTT GAG CGT CTC GAG CAT TGG 404
Ala Leu Leu Pro Phe Pro Gly Arg Ile Leu Glu Arg Leu Glu His Trp
10 15 20 

GCT AAG ACC CGT CCA GAA CAA ACC TGC GTT GCT GCC AGG GCG GCA AAT 452 
Ala Lys Thr Arg Pro Glu Gln Thr Cys Val Ala Ala Arg Ala Ala Asn
25 30 35 

GGG GAA TGG CGT CGT ATC AGC TAC GCG GAA ATG TTC CAC AAC GTC CGC 500 
Gly Glu Trp Arg Arg Ile Ser Tyr Ala Glu Met Phe His Asn Val Arg
40 45 50 

GCC ATC GCA CAG AGC TTG CTT CCT TAC GGA CTA TCG GCA GAG CGT CCG 548 
Ala Ile Ala Gln Ser Leu Leu Pro Tyr Gly Leu Ser Ala Glu Arg Pro
55 60 65 

CTG CTT ATC GTC TCT GGA AAT GAC CTG GAA CAT CTT CAG CTG GCA TTT 596 
Leu Leu Ile Val Ser Gly Asn Asp Leu Glu His Leu Gln Leu Ala Phe
70 75 80 85 

GGG GCT ATG TAT GCG GGC ATT CCC TAT TGC CCG GTG TCT CCT GCT TAT 644 
Gly Ala Met Tyr Ala Gly Ile Pro Tyr Cys Pro Val Ser Pro Ala Tyr
90 95 100 

TCA CTG CTG TCG CAA GAT TTG GCG AAG CTG CGT CAC ATC GTA GGT CTT 692 
Ser Leu Leu Ser Gln Asp Leu Ala Lys Leu Arg His Ile Val Gly Leu
105 110 115 

CTG CAA CCG GGA CTG GTC TTT GCT GCC GAT GCA GCA CCT TTC CAG GGG 740 
Leu Gln Pro Gly Leu Val Phe Ala Ala Asp Ala Ala Pro Phe Gln
120 125 130 132 

ACAGCAAGCG AACCGGAATT GCCAGCTGGG GCGCCCTCTG GTAAGGTTGG GAAGCCCTGC 800 

AAAGTAAACT GGATGGCTTT CTTGCCGCCA AGGATCTGAT GGCGCAGGGG ATCAAGATCT 860 

GATCAAGAGA CAGGATGAGG ATCGTTTCGC ATG ATT GAA CAA GAT GGA TTG CAC 914 

Met Ile Glu Gln Asp Gly Leu His
1 5 

GCA GGT TCT CCG GCC GCT TGG GTG GAG AGG CTA TTC GGC TAT GAC TGG 962 
Ala Gly Ser Pro Ala Ala Trp Val Glu Arg Leu Phe Gly Tyr Asp Trp 
10 15 20 
GCA CAA CAG ACA ATC GGC TGC TCT GAT GCC GCC GTG TTC CGG CTG TCA 1010
Ala Gln Gln Thr Ile Gly Cys Ser Asp Ala Ala Val Phe Arg Leu Ser

25 30 35 40 

GCG CAG GGG CGC CCG GTT CTT TTT GTC AAG ACC GAC CTG TCC GGT GCC 1058 

Ala Gln Gly Arg Pro Val Leu Phe Val Lys Thr Asp Leu Ser Gly Ala

45 50 55 

CTG AAT GAA CTG CAG GAC GAG GCA GCG CGG CTA TCG TGG CTG GCC ACG 1106 

Leu Asn Glu Leu Gln Asp Glu Ala Ala Arg Leu Ser Trp Leu Ala Thr

60 65 70 

ACG GGC GTT CCT TGC GCA GCT GTG CTC GAC GTT GTC ACT GAA GCG GGA 1154 

Thr Gly Val Pro Cys Ala Ala Val Leu Asp Val Val Thr Glu Ala Gly 

75 80 85 

AGG GAC TGG CTG CTA TTG GGC GAA GTG CCG GGG CAG GAT CTC CTG TCA 1202 

Arg Asp Trp Leu Leu Leu Gly Glu Val Pro Gly Gln Asp Leu Leu Ser
90 95 100 

TCT CAC CTT GCT CCT GCC GAG AAA GTA TCC ATC ATG GCT GAT GCA ATG 1250
Ser His Leu Ala Pro Ala Glu Lys Val Ser Ile Met Ala Asp Ala Met

105 110 115 120 

CGG CGG CTG CAT ACG CTT GAT CCG GCT ACC TGC CCA TTC GAC CAC CAA 1298 

Arg Arg Leu His Thr Leu Asp Pro Ala Thr Cys Pro Phe Asp His Gln

125 130 135 

GCG AAA CAT CGC ATC GAG CGA GCA CGT ACT CGG ATG GAA GCC GGT CTT 1346
Ala Lys His Arg Ile Glu Arg Ala Arg Thr Arg Met Glu Ala Gly Leu

140 145 150 

GTC GAT CAG GAT GAT CTG GAC GAA GAG CAT CAG GGG CTC GCG CCA GCC 1394 

Val Asp Gln Asp Asp Leu Asp Glu Glu His Gln Gly Leu Ala Pro Ala

155 160 165 

GAA CTG TTC GCC AGG CTC AAG GCG CGC ATG CCC GAC GGC GAG GAT CTC 1442 

Glu Leu Phe Ala Arg Leu Lys Ala Arg Met Pro Asp Gly Glu Asp Leu 
170 175 180 

GTC GTG ACC CAT GGC GAT GCC TGC TTG CCG AAT ATC ATG GTG GAA AAT 1490 

Val Val Thr His Gly Asp Ala Cys Leu Pro Asn Ile Met Val Glu Asn

185 190 195 200 

GGC CGC TTT TCT GGA TTC ATC GAC TGT GGC CGG CTG GGT GTG GCG GAC 1538
Gly Arg Phe Ser Gly Phe Ile Asp Cys Gly Arg Leu Gly Val Ala Asp

205 210 215 

CGC TAT CAG GAC ATA GCG TTG GCT ACC CGT GAT ATT GCT GAA GAG CTT 1586 

Arg Tyr Gln Asp Ile Ala Leu Ala Thr Arg Asp Ile Ala Glu Glu Leu

220 225 230 

GGC GGC GAA TGG GCT GAC CGC TTC CTC GTG CTT TAC GGT ATC GCC GCT 1634 

Gly Gly Glu Trp Ala Asp Arg Phe Leu Val Leu Tyr Gly Ile Ala Ala

235 240 245 
CCC GAT TCG CAG CGC ATC GCC TTC TAT CGC CTT CTT GAC GAG TTC TTC 1682
Pro Asp Ser Gln Arg Ile Ala Phe Tyr Arg Leu Leu Asp Glu Phe Phe
250 255 260 264 

TGAGCGGGAC TCTGGGGTTC GAAATGACCG ACCAAGCGAC GCCCCT GTT TTG CAA 1737 

Val Leu Gln
563 565 

TGG CGG TCG GCG AAA GTT GAT GCG CTG TAT CGT GGT GAA GAT CAA TCC 1785 
Trp Arg Ser Ala Lys Val Asp Ala Leu Tyr Arg Gly Glu Asp Gln Ser
570 575 580 

ATG CTG CGT GAC GAG GCC ACA CTG TGA GTTGGTCAGG GGGGGCTTAC 1832
Met Leu Arg Asp Glu Ala Thr Leu
585 589

TCGGCGTTTT CCGACACTGC GTTGGTTGCG GCAGTGCGCA CCCCCTGGAT TGATTGCGGG 1892

GGTGCCCTGT CGCTGGTGTC GCCTATCGAC TTAGGGGTAA AGGTCGCTCG CGAAGTTCTG 1952

ATGCGTGCGT CGCTTGAACC ACAAATGGTC GATAGCGTAC TCGCAGGCTC TATGGCTCAA 2012

GCAAGCTTTG ATGCTTACCT GCTCCCGCGG CACATTGGCT TGTACAGCGG TGTTCCCAAG 2072

TCGGTTCCGG CCTTGGGGGT GCAGCGCATT TGCGGCACAG GCTTCGAACT GCTTCGGCAG 2132

GCCGGCGAGC AGATTTCCCA AGGCGCTGAT CACGTGCTGT GTGTCGCGGG CTGCAG 2188



FIG. 2g: 
CTGCAGCCGA GCATCGATTG AGCACTTTAC CCAGCTGCGC TGGCTGACCA TTCAGAATGG 60

CCCGCGGCAC TATCCAATCT AAATCGATCT TCGGGCGCCG CGGGCATCAT GCCCGCGGCG 120 

CTCGCCTCAT TTCAATCTCT AACTTGATAA AAACAGAGCT GTTCTCCGGT CTTGGTGGAT 180

CAAGGCCAGT CGCGGAGAGT CTCGAAGAGG AGAGTACAGT GAACGCCGAG TCCACATTGC 240

AACCGCAGGC ATCATCATGC TCTGCTCAGC CACGCTACCG CAGTGTGTCG ATTGGTCATC 300

CTCCGGTTGA GGTTACGCAA GACGCTGGAG GTATTGTCCG G ATG CGT TCT CTC GAG 356 

Met Arg Ser Leu Glu 
1 5 

GCG CTT CTT CCC TTC CCG GGT CGA ATT CTT GAG CGT CTC GAG CAT TGG 404 
Ala Leu Leu Pro Phe Pro Gly Arg Ile Leu Glu Arg Leu Glu His Trp
10 15 20 

GCT AAG ACC CGT CCA GAA CAA ACC TGC GTT GCT GCC AGG GCG GCA AAT 452 
Ala Lys Thr Arg Pro Glu Gln Thr Cys Val Ala Ala Arg Ala Ala Asn
25 30 35 

GGG GAA TGG CGT CGT ATC AGC TAC GCG GAA ATG TTC CAC AAC GTC CGC 500 
Gly Glu Trp Arg Arg Ile Ser Tyr Ala Glu Met Phe His Asn Val Arg
40 45 50 

GCC ATC GCA CAG AGC TTG CTT CCT TAC GGA CTA TCG GCA GAG CGT CCG 548 
Ala Ile Ala Gln Ser Leu Leu Pro Tyr Gly Leu Ser Ala Glu Arg Pro
55 60 65 

CTG CTT ATC GTC TCT GGA AAT GAC CTG GAA CAT CTT CAG CTG GCA TTT 596 
Leu Leu Ile Val Ser Gly Asn Asp Leu Glu His Leu Gln Leu Ala Phe
70 75 80 85 

GGG GCT ATG TAT GCG GGC ATT CCC TAT TGC CCG GTG TCT CCT GCT TAT 644 
Gly Ala Met Tyr Ala Gly Ile Pro Tyr Cys Pro Val Ser Pro Ala Tyr
90 95 100 

TCA CTG CTG TCG CAA GAT TTG GCG AAG CTG CGT CAC ATC GTA GGT CTT 692 
Ser Leu Leu Ser Gln Asp Leu Ala Lys Leu Arg His Ile Val Gly Leu
105 110 115 

CTG CAA CCG GGA CTG GTC TTT GCT GCC GAT GCA GCA CCT TTC CAG GGG 740 
Leu Gln Pro Gly Leu Val Phe Ala Ala Asp Ala Ala Pro Phe Gln
120 125 130 132 

GAGAGGCGGT TTGCGTATTG GGCGCATGCA TAAAAACTGT TGTAATTCAT TAAGCATTCT 800 

GCCGACATGG AAGCCATCAC AAACGGCATG ATGAACCTGA ATCGCCAGCG GCATCAGCAC 860

CTTGTCGCCT TGCGTATAAT ATTTGCCCAT GGACGCACAC CGTGGAAACG GATGAAGGCA 920 

CGAACCCAGT TGACATAAGC CTGTTCGGTT CGTAAACTGT AATGCAAGTA GCGTATGCGC 980 

TCACGCAACT GGTCCAGAAC CTTGACCGAA CGCAGCGGTG GTAACGGCGC AGTGGCGGTT 1040

TTCATGGCTT GTTATGACTG TTTTTTTGTA CAGTCTATGC CTCGGGCATC CAAGCAGCAA 1100 
GCGCGTTACG CCGTGGGTCG ATGTTTGATG TTATGGAGCA GCAACG ATG TTA CGC 1155

Met Leu Arg 
1 

AGC AGC AAC GAT GTT ACG CAG CAG GGC AGT CGC CCT AAA ACA AAG TTA 1203
Ser Ser Asn Asp Val Thr Gln Gln Gly Ser Arg Pro Lys Thr Lys Leu
5 10 15 

GGT GGC TCA AGT ATG GGC ATC ATT CGC ACA TGT AGG CTC GGC CCT GAC 1251
Gly Gly Ser Ser Met Gly Ile Ile Arg Thr Cys Arg Leu Gly Pro Asp
20 25 30 35 

CAA GTC AAA TCC ATG CGG GCT GCT CTT GAT CTT TTC GGT CGT GAG TTC 1299 
Gln Val Lys Ser Met Arg Ala Ala Leu Asp Leu Phe Gly Arg Glu Phe
40 45 50 

GGA GAC GTA GCC ACC TAC TCC CAA CAT CAG CCG GAC TCC GAT TAC CTC 1347 
Gly Asp Val Ala Thr Tyr Ser Gln His Gln Pro Asp Ser Asp Tyr Leu
55 60 65 

GGG AAC TTG CTC CGT AGT AAG ACA TTC ATC GCG CTT GCT GCC TTC GAC 1395 
Gly Asn Leu Leu Arg Ser Lys Thr Phe Ile Ala Leu Ala Ala Phe Asp
70 75 80 

CAA GAA GCG GTT GTT GGC GCT CTC GCG GCT TAC GTT CTG CCC AGG TTT 1443 
Gln Glu Ala Val Val Gly Ala Leu Ala Ala Tyr Val Leu Pro Arg Phe
85 90 95 

GAG CAG CCG CGT AGT GAG ATC TAT ATC TAT GAT CTC GCA GTC TCC GGC 1491 
Glu Gln Pro Arg Ser Glu Ile Tyr Ile Tyr Asp Leu Ala Val Ser Gly
100 105 110 115 

GAG CAC CGG AGG CAG GGC ATT GCC ACC GCG CTC ATC AAT CTC CTC AAG 1539
Glu His Arg Arg Gln Gly Ile Ala Thr Ala Leu Ile Asn Leu Leu Lys
120 125 130 

CAT GAG GCC AAC GCG CTT GGT GCT TAT GTG ATC TAC GTG CAA GCA GAT 1587 
His Glu Ala Asn Ala Leu Gly Ala Tyr Val Ile Tyr Val Gln Ala Asp
135 140 145 

TAC GGT GAC GAT CCC GCA GTG GCT CTC TAT ACA AAG TTG GGC ATA CGG 1635 
Tyr Gly Asp Asp Pro Ala Val Ala Leu Tyr Thr Lys Leu Gly Ile Arg
150 155 160 

GAA GAA GTG ATG CAC TTT GAT ATC GAC CCA AGT ACC GCC ACC TAA CAA 1683 
Glu Glu Val Met His Phe Asp Ile Asp Pro Ser Thr Ala Thr
165 170 175 177 

TTCGTTCAAG CCGAGATCGG CTTCCCCT GTT TTG CAA TGG CGG TCG GCG AAA 1735
Val Leu Gln Trp Arg Ser Ala Lys
563 565 570 

GTT GAT GCG CTG TAT CGT GGT GAA GAT CAA TCC ATG CTG CGT GAC GAG 1783 
Val Asp Ala Leu Tyr Arg Gly Glu Asp Gln Ser Met Leu Arg Asp Glu
575 580 585 
GCC ACA CTG TGA GTTGGTCAGG GGGGGCTTAC TCGGCGTTTT CCGACACTGC 1835
Ala Thr Leu 
589 

GTTGGTTGCG GCAGTGCGCA CCCCCTGGAT TGATTGCGGG GGTGCCCTGT CGCTGGTGTC 1895 

GCCTATCGAC TTAGGGGTAA AGGTCGCTCG CGAAGTTCTG ATGCGTGCGT CGCTTGAACC 1955 

ACAAATGGTC GATAGCGTAC TCGCAGGCTC TATGGCTCAA GCAAGCTTTG ATGCTTACCT 2015

GCTCCCGCGG CACATTGGCT TGTACAGCGG TGTTCCCAAG TCGGTTCCGG CCTTGGGGGT 2075

GCAGCGCATT TGCGGCACAG GCTTCGAACT GCTTCGGCAG GCCGGCGAGC AGATTTCCCA 2135

AGGCGCTGAT CACGTGCTGT GTGTCGCGGG CTGCAG 2171 
FIG. 2h: 
CTGCAGCCGA GCATCGATTG AGCACTTTAC CCAGCTGCGC TGGCTGACCA TTCAGAATGG 60

CCCGCGGCAC TATCCAATCT AAATCGATCT TCGGGCGCCG CGGGCATCAT GCCCGCGGCG 120 

CTCGCCTCAT TTCAATCTCT AACTTGATAA AAACAGAGCT GTTCTCCGGT CTTGGTGGAT 180 

CAAGGCCAGT CGCGGAGAGT CTCGAAGAGG AGAGTACAGT GAACGCCGAG TCCACATTGC 240 

AACCGCAGGC ATCATCATGC TCTGCTCAGC CACGCTACCG CAGTGTGTCG ATTGGTCATC 300 

CTCCGGTTGA GGTTACGCAA GACGCTGGAG GTATTGTCCG G ATG CGT TCT CTC GAG 356 

Met Arg Ser Leu Glu 
1 5 

GCG CTT CTT CCC TTC CCG GGT CGA ATT CTT GAG CGT CTC GAG CAT TGG 404 
Ala Leu Leu Pro Phe Pro Gly Arg Ile Leu Glu Arg Leu Glu His Trp
10 15 20 

GCT AAG ACC CGT CCA GAA CAA ACC TGC GTT GCT GCC AGG GCG GCA AAT 452
Ala Lys Thr Arg Pro Glu Gln Thr Cys Val Ala Ala Arg Ala Ala Asn
25 30 35 

GGG GAA TGG CGT CGT ATC AGC TAC GCG GAA ATG TTC CAC AAC GTC CGC 500 
Gly Glu Trp Arg Arg Ile Ser Tyr Ala Glu Met Phe His Asn Val Arg
40 45 50 

GCC ATC GCA CAG AGC TTG CTT CCT TAC GGA CTA TCG GCA GAG CGT CCG 548 
Ala Ile Ala Gln Ser Leu Leu Pro Tyr Gly Leu Ser Ala Glu Arg Pro
55 60 65 

CTG CTT ATC GTC TCT GGA AAT GAC CTG GAA CAT CTT CAG CTG GCA TTT 596 
Leu Leu Ile Val Ser Gly Asn Asp Leu Glu His Leu Gln Leu Ala Phe
70 75 80 85 

GGG GCT ATG TAT GCG GGC ATT CCC TAT TGC CCG GTG TCT CCT GCT TAT 644 
Gly Ala Met Tyr Ala Gly Ile Pro Tyr Cys Pro Val Ser Pro Ala Tyr
90 95 100 

TCA CTG CTG TCG CAA GAT TTG GCG AAG CTG CGT CAC ATC GTA GGT CTT 692 
Ser Leu Leu Ser Gln Asp Leu Ala Lys Leu Arg His Ile Val Gly Leu
105 110 115 

CTG CAA CCG GGA CTG GTC TTT GCT GCC GAT GCA GCA CCT TTC CAG CGC 740 
Leu Gln Pro Gly Leu Val Phe Ala Ala Asp Ala Ala Pro Phe Gln Arg
120 125 130 133 

GCT GTT TTG CAA TGG CGG TCG GCG AAA GTT GAT GCG CTG TAT CGT GGT 788 

Ala Val Leu Gln Trp Arg Ser Ala Lys Val Asp Ala Leu Tyr Arg Gly
562 565 570 575 

GAA GAT CAA TCC ATG CTG CGT GAC GAG GCC ACA CTG TGA GTTGGTCAGG 837 
Glu Asp Gln Ser Met Leu Arg Asp Glu Ala Thr Leu
580 585 589 

GGGGGCTTAC TCGGCGTTTT CCGACACTGC GTTGGTTGCG GCAGTGCGCA CCCCCTGGAT 897 

TGATTGCGGG GGTGCCCTGT CGCTGGTGTC GCCTATCGAC TTAGGGGTAA AGGTCGCTCG 957 
CGAAGTTCTG ATGCGTGCGT CGCTTGAACC ACAAATGGTC GATAGCGTAC TCGCAGGCTC 1017

TATGGCTCAA GCAAGCTTTG ATGCTTACCT GCTCCCGCGG CACATTGGCT TGTACAGCGG 1077

TGTTCCCAAG TCGGTTCCGG CCTTGGGGGT GCAGCGCATT TGCGGCACAG GCTTCGAACT 1137 

GCTTCGGCAG GCCGGCGAGC AGATTTCCCA AGGCGCTGAT CACGTGCTGT GTGTCGCGGG 1197 

CTGCAG 1203 
FIG. 2i: 
GAATTCCCCT GGCGACGAAA GGGCGGCAGG CCGCATGGCC ACGGCTGGGC GGTAACTGAT 60

GCTTGCGTTA ATCGTTAACC GTTTGAAATT CCTTGCCAAA TTTCGGCGAG AGAATCATGC 120 

GGGTACGCCT TTCCGTGCGC TTTGATCTGC GCTTCCGTGC CTTGAATCAG AAAAATAGTT 180 

AATTGACAGA ACTATAGGTT CGCAGTAGCT TTTGCTCACC CACCAAATCC ACAGCACTGG 240 

GGTGCACG ATG AAT AGC TAC GAT GGC CGT TGG TCT ACC GTT GAT GTG AAG 290 
Met Asn Ser Tyr Asp Gly Arg Trp Ser Thr Val Asp Val Lys 
1 5 10

GTT GAA GAA GGT ATC GCT TGG GTC ACG CTG AAC CGC CCG GAG AAG CGC 338
Val Glu Glu Gly Ile Ala Trp Val Thr Leu Asn Arg Pro Glu Lys Arg
15 20 25 30 

AAC GCA ATG AGC CCA ACT CTC AAT CGA GAG ATG GTC GAG GTT CTG GAG 386 
Asn Ala Met Ser Pro Thr Leu Asn Arg Glu Met Val Glu Val Leu Glu 
35 40 45 

GTG CTG GAG CAG GAC GCA GAT GCT CGC GTG CTT GTT CTG ACT GGT GCA 434 
Val Leu Glu Gln Asp Ala Asp Ala Arg Val Leu Val Leu Thr Gly Ala
50 55 60 

GGC GAA TCC TGG ACC GCG GGC ATG GAC CTG AAG GAG TAT TTC CGC GAG 482 
Gly Glu Ser Trp Thr Ala Gly Met Asp Leu Lys Glu Tyr Phe Arg Glu 
65 70 75 

ACC GAT GCT GGC CCC GAA ATT CTG CAA GAG AAG ATT CGT CGGGGACAGC 531 
Thr Asp Ala Gly Pro Glu Ile Leu Gln Glu Lys Ile Arg
80 85 90 91 

AAGCGAACCG GAATTGCCAG CTGGGGCGCC CTCTGGTAAG GTTGGGAAGC CCTGCAAAGT 591 

AAACTGGATG GCTTTCTTGC CGCCAAGGAT CTGATGGCGC AGGGGATCAA GATCTGATCA 651 

AGAGACAGGA TGAGGATCGT TTCGC ATG ATT GAA CAA GAT GGA TTG CAC GCA 703
Met Ile Glu Gln Asp Gly Leu His Ala
1 5 

GGT TCT CCG GCC GCT TGG GTG GAG AGG CTA TTC GGC TAT GAC TGG GCA 751 
Gly Ser Pro Ala Ala Trp Val Glu Arg Leu Phe Gly Tyr Asp Trp Ala 
10 15 20 25 

CAA CAG ACA ATC GGC TGC TCT GAT GCC GCC GTG TTC CGG CTG TCA GCG 799
Gln Gln Thr Ile Gly Cys Ser Asp Ala Ala Val Phe Arg Leu Ser Ala
30 35 40 

CAG GGG CGC CCG GTT CTT TTT GTC AAG ACC GAC CTG TCC GGT GCC CTG 847 
Gln Gly Arg Pro Val Leu Phe Val Lys Thr Asp Leu Ser Gly Ala Leu
45 50 55 

AAT GAA CTG CAG GAC GAG GCA GCG CGG CTA TCG TGG CTG GCC ACG ACG 895 
Asn Glu Leu Gln Asp Glu Ala Ala Arg Leu Ser Trp Leu Ala Thr Thr
60 65 70 
GGC GTT CCT TGC GCA GCT GTG CTC GAC GTT GTC ACT GAA GCG GGA AGG 943
Gly Val Pro Cys Ala Ala Val Leu Asp Val Val Thr Glu Ala Gly Arg 
75 80 85 

GAC TGG CTG CTA TTG GGC GAA GTG CCG GGG CAG GAT CTC CTG TCA TCT 991 
Asp Trp Leu Leu Leu Gly Glu Val Pro Gly Gln Asp Leu Leu Ser Ser
90 95 100 105 

CAC CTT GCT CCT GCC GAG AAA GTA TCC ATC ATG GCT GAT GCA ATG CGG 1039
His Leu Ala Pro Ala Glu Lys Val Ser Ile Met Ala Asp Ala Met Arg
110 115 120 

CGG CTG CAT ACG CTT GAT CCG GCT ACC TGC CCA TTC GAC CAC CAA GCG 1087 
Arg Leu His Thr Leu Asp Pro Ala Thr Cys Pro Phe Asp His Gln Ala
125 130 135 

AAA CAT CGC ATC GAG CGA GCA CGT ACT CGG ATG GAA GCC GGT CTT GTC 1135 
Lys His Arg Ile Glu Arg Ala Arg Thr Arg Met Glu Ala Gly Leu Val
140 145 150 

GAT CAG GAT GAT CTG GAC GAA GAG CAT CAG GGG CTC GCG CCA GCC GAA 1183 
Asp Gln Asp Asp Leu Asp Glu Glu His Gln Gly Leu Ala Pro Ala Glu
155 160 165 

CTG TTC GCC AGG CTC AAG GCG CGC ATG CCC GAC GGC GAG GAT CTC GTC 1231 
Leu Phe Ala Arg Leu Lys Ala Arg Met Pro Asp Gly Glu Asp Leu Val 
170 175 180 185 

GTG ACC CAT GGC GAT GCC TGC TTG CCG AAT ATC ATG GTG GAA AAT GGC 1279 
Val Thr His Gly Asp Ala Cys Leu Pro Asn Ile Met Val Glu Asn Gly
190 195 200 

CGC TTT TCT GGA TTC ATC GAC TGT GGC CGG CTG GGT GTG GCG GAC CGC 1327
Arg Phe Ser Gly Phe Ile Asp Cys Gly Arg Leu Gly Val Ala Asp Arg
205 210 215 

TAT CAG GAC ATA GCG TTG GCT ACC CGT GAT ATT GCT GAA GAG CTT GGC 1375 
Tyr Gln Asp Ile Ala Leu Ala Thr Arg Asp Ile Ala Glu Glu Leu Gly
220 225 230 

GGC GAA TGG GCT GAC CGC TTC CTC GTG CTT TAC GGT ATC GCC GCT CCC 1423 
Gly Glu Trp Ala Asp Arg Phe Leu Val Leu Tyr Gly Ile Ala Ala Pro
235 240 245 

GAT TCG CAG CGC ATC GCC TTC TAT CGC CTT CTT GAC GAG TTC TTC TGA 1471 
Asp Ser Gln Arg Ile Ala Phe Tyr Arg Leu Leu Asp Glu Phe Phe
250 255 260 264 

GCGGGACTCT GGGGTTCGAA ATGACCGACC AAGCGACGCC CC GAG CAG GGC ATG 1525
Glu Gln Gly Met
255 

AAG CAG TTC CTT GAC GAG AAA AGC ATC AAG CCG GGC TTG CAG ACC TAC 1573 
Lys Gln Phe Leu Asp Glu Lys Ser Ile Lys Pro Gly Leu Gln Thr Tyr
260 265 270 
AAG CGC TGA TAAATGCGCC GGGGCCCTCG CTGCGCCCCC GGCCTTCCAA TAATGACAAT 1632
Lys Arg 
275 276 

AATGAGGAGT GCCCAATGTT TCACGTGCCC CTGCTTATTG GTGGTAAGCC TTGTTCAGCA 1692 

TCTGATGAGC GCACCTTCGA GCGTCGTAGC CCGCTGACCG GAGAAGTGGT ATCGCGCGTC 1752 

GCTGCTGCCA GTTTGGAAGA TGCGGACGCC GCAGTGGCCG CTGCACAGGC TGCGTTTCCT 1812 

GAATGGGCGG CGCTTGCTCC GAGCGAACGC CGTGCCCGAC TGCTGCGAGC GGCGGATCTT 1872 

CTAGAGGACC GTTCTTCCGA GTTCACCGCC GCAGCGAGTG AAACTGGCGC AGCGGGAAAC 1932

TGGTATGGGT TTAACGTTTA CCTGGCGGCG GGCATGTTGC GGGGAATTC 1981 
FIG. 2j: 
GAATTCCCCT GGCGACGAAA GGGCGGCAGG CCGCATGGCC ACGGCTGGGC GGTAACTGAT 60

GCTTGCGTTA ATCGTTAACC GTTTGAAATT CCTTGCCAAA TTTCGGCGAG AGAATCATGC 120 

GGGTACGCCT TTCCGTGCGC TTTGATCTGC GCTTCCGTGC CTTGAATCAG AAAAATAGTT 180 

AATTGACAGA ACTATAGGTT CGCAGTAGCT TTTGCTCACC CACCAAATCC ACAGCACTGG 240 

GGTGCACG ATG AAT AGC TAC GAT GGC CGT TGG TCT ACC GTT GAT GTG AAG 290 
Met Asn Ser Tyr Asp Gly Arg Trp Ser Thr Val Asp Val Lys 
1 5 10

GTT GAA GAA GGT ATC GCT TGG GTC ACG CTG AAC CGC CCG GAG AAG CGC 338 
Val Glu Glu Gly Ile Ala Trp Val Thr Leu Asn Arg Pro Glu Lys Arg
15 20 25 30 

AAC GCA ATG AGC CCA ACT CTC AAT CGA GAG ATG GTC GAG GTT CTG GAG 386
Asn Ala Met Ser Pro Thr Leu Asn Arg Glu Met Val Glu Val Leu Glu
35 40 45 

GTG CTG GAG CAG GAC GCA GAT GCT CGC GTG CTT GTT CTG ACT GGT GCA 434 
Val Leu Glu Gln Asp Ala Asp Ala Arg Val Leu Val Leu Thr Gly Ala
50 55 60 

GGC GAA TCC TGG ACC GCG GGC ATG GAC CTG AAG GAG TAT TTC CGC GAG 482 
Gly Glu Ser Trp Thr Ala Gly Met Asp Leu Lys Glu Tyr Phe Arg Glu 
65 70 75 

ACC GAT GCT GGC CCC GAA ATT CTG CAA GAG AAG ATT CGT CGGGGGAGAG 531 
Thr Asp Ala Gly Pro Glu Ile Leu Gln Glu Lys Ile Arg
80 85 90 91 

GCGGTTTGCG TATTGGGCGC ATGCATAAAA ACTGTTGTAA TTCATTAAGC ATTCTGCCGA 591 

CATGGAAGCC ATCACAAACG GCATGATGAA CCTGAATCGC CAGCGGCATC AGCACCTTGT 651

CGCCTTGCGT ATAATATTTG CCCATGGACG CACACCGTGG AAACGGATGA AGGCACGAAC 711

CCAGTTGACA TAAGCCTGTT CGGTTCGTAA ACTGTAATGC AAGTAGCGTA TGCGCTCACG 771 

CAACTGGTCC AGAACCTTGA CCGAACGCAG CGGTGGTAAC GGCGCAGTGG CGGTTTTCAT 831 

GGCTTGTTAT GACTGTTTTT TTGTACAGTC TATGCCTCGG GCATCCAAGC AGCAAGCGCG 891 

TTACGCCGTG GGTCGATGTT TGATGTTATG GAGCAGCAAC G ATG TTA CGC AGC AGC 947
Met Leu Arg Ser Ser
1 5 

AAC GAT GTT ACG CAG CAG GGC AGT CGC CCT AAA ACA AAG TTA GGT GGC 995 
Asn Asp Val Thr Gln Gln Gly Ser Arg Pro Lys Thr Lys Leu Gly Gly
10 15 20 

TCA AGT ATG GGC ATC ATT CGC ACA TGT AGG CTC GGC CCT GAC CAA GTC 1043 
Ser Ser Met Gly Ile Ile Arg Thr Cys Arg Leu Gly Pro Asp Gln Val
25 30 35

AAA TCC ATG CGG GCT GCT CTT GAT CTT TTC GGT CGT GAG TTC GGA GAC 1091
Lys Ser Met Arg Ala Ala Leu Asp Leu Phe Gly Arg Glu Phe Gly Asp 
40 45 50 

GTA GCC ACC TAC TCC CAA CAT CAG CCG GAC TCC GAT TAC CTC GGG AAC 1139 
Val Ala Thr Tyr Ser Gln His Gln Pro Asp Ser Asp Tyr Leu Gly Asn
55 60 65 

TTG CTC CGT AGT AAG ACA TTC ATC GCG CTT GCT GCC TTC GAC CAA GAA 1187 
Leu Leu Arg Ser Lys Thr Phe Ile Ala Leu Ala Ala Phe Asp Gln Glu
70 75 80 85 

GCG GTT GTT GGC GCT CTC GCG GCT TAC GTT CTG CCC AGG TTT GAG CAG 1235 
Ala Val Val Gly Ala Leu Ala Ala Tyr Val Leu Pro Arg Phe Glu Gln
90 95 100 

CCG CGT AGT GAG ATC TAT ATC TAT GAT CTC GCA GTC TCC GGC GAG CAC 1283 
Pro Arg Ser Glu Ile Tyr Ile Tyr Asp Leu Ala Val Ser Gly Glu His
105 110 115 

CGG AGG CAG GGC ATT GCC ACC GCG CTC ATC AAT CTC CTC AAG CAT GAG 1331 
Arg Arg Gln Gly Ile Ala Thr Ala Leu Ile Asn Leu Leu Lys His Glu
120 125 130 

GCC AAC GCG CTT GGT GCT TAT GTG ATC TAC GTG CAA GCA GAT TAC GGT 1379 
Ala Asn Ala Leu Gly Ala Tyr Val Ile Tyr Val Gln Ala Asp Tyr Gly
135 140 145 

GAC GAT CCC GCA GTG GCT CTC TAT ACA AAG TTG GGC ATA CGG GAA GAA 1427 
Asp Asp Pro Ala Val Ala Leu Tyr Thr Lys Leu Gly Ile Arg Glu Glu
150 155 160 165 

GTG ATG CAC TTT GAT ATC GAC CCA AGT ACC GCC ACC TAA CAATTCGTTC 1476
Val Met His Phe Asp Ile Asp Pro Ser Thr Ala Thr
170 175 177 

AAGCCGAGAT CGGCTTCCCC GAG CAG GGC ATG AAG CAG TTC CTT GAC GAG 1526
Glu Gln Gly Met Lys Gln Phe Leu Asp Glu
255 260 

AAA AGC ATC AAG CCG GGC TTG CAG ACC TAC AAG CGC TGA TAAATGCGCC 1575
Lys Ser Ile Lys Pro Gly Leu Gln Thr Tyr Lys Arg
265 270 275 276 

GGGGCCCTCG CTGCGCCCCC GGCCTTCCAA TAATGACAAT AATGAGGAGT GCCCAATGTT 1635 

TCACGTGCCC CTGCTTATTG GTGGTAAGCC TTGTTCAGCA TCTGATGAGC GCACCTTCGA 1695

GCGTCGTAGC CCGCTGACCG GAGAAGTGGT ATCGCGCGTC GCTGCTGCCA GTTTGGAAGA 1755 

TGCGGACGCC GCAGTGGCCG CTGCACAGGC TGCGTTTCCT GAATGGGCGG CGCTTGCTCC 1815 



GAGCGAACGC CGTGCCCGAC TGCTGCGAGC GGCGGATCTT CTAGAGGACC GTTCTTCCGA 1875

GTTCACCGCC GCAGCGAGTG AAACTGGCGC AGCGGGAAAC TGGTATGGGT TTAACGTTTA 1935

CCTGGCGGCG GGCATGTTGC GGGGAATTC 1964
FIG. 2k:

GAATTCCCCT GGCGACGAAA GGGCGGCAGG CCGCATGGCC ACGGCTGGGC GGTAACTGAT 60

GCTTGCGTTA ATCGTTAACC GTTTGAAATT CCTTGCCAAA TTTCGGCGAG AGAATCATGC 120 

GGGTACGCCT TTCCGTGCGC TTTGATCTGC GCTTCCGTGC CTTGAATCAG AAAAATAGTT 180 

AATTGACAGA ACTATAGGTT CGCAGTAGCT TTTGCTCACC CACCAAATCC ACAGCACTGG 240 

GGTGCACG ATG AAT AGC TAC GAT GGC CGT TGG TCT ACC GTT GAT GTG AAG 290 
Met Asn Ser Tyr Asp Gly Arg Trp Ser Thr Val Asp Val Lys 
1 5 10

ACC TAC AAG CGC TGA TAAATGCGCC GGGGCCCTCG CTGCGCCCCC GGCCTTCCAA 633
Thr Tyr Lys Arg 
275 276 



TAATGACAAT AATGAGGAGT GCCCAATGTT TCACGTGCCC CTGCTTATTG GTGGTAAGCC 693

TTGTTCAGCA TCTGATGAGC GCACCTTCGA GCGTCGTAGC CCGCTGACCG GAGAAGTGGT 753

ATCGCGCGTC GCTGCTGCCA GTTTGGAAGA TGCGGACGCC GCAGTGGCCG CTGCACAGGC 813 

TGCGTTTCCT GAATGGGCGG CGCTTGCTCC GAGCGAACGC CGTGCCCGAC TGCTGCGAGC 873 

GGCGGATCTT CTAGAGGACC GTTCTTCCGA GTTCACCGCC GCAGCGAGTG AAACTGGCGC 933 

AGCGGGAAAC TGGTATGGGT TTAACGTTTA CCTGGCGGCG GGCATGTTGC GGGGAATTC 992 



FIG. 2l:

GAATTCCAAT AATGACAATA ATGAGGAGTG CCCA ATG TTT CAC GTG CCC CTG CTT 55

Met Phe His Val Pro Leu Leu 
1 5 

ATT GGT GGT AAG CCT TGT TCA GCA TCT GAT GAG CGC ACC TTC GAG CGT 103 
Ile Gly Gly Lys Pro Cys Ser Ala Ser Asp Glu Arg Thr Phe Glu Arg
10 15 20 

CGT AGC CCG CTG ACC GGA GAA GTG GTA TCG CGC GTC GCT GCT GCC AGT 151 
Arg Ser Pro Leu Thr Gly Glu Val Val Ser Arg Val Ala Ala Ala Ser 
25 30 35 

TTG GAA GAT GCG GAC GCC GCA GTG GCC GCT GCA CAG GCT GCG TTT CCT 199 
Leu Glu Asp Ala Asp Ala Ala Val Ala Ala Ala Gln Ala Ala Phe Pro
40 45 50 55 

GAA TGG GCG GCG CTT GCT CCG AGC GAA CGC CGT GCC CGA CTG CTG CGA 247 
Glu Trp Ala Ala Leu Ala Pro Ser Glu Arg Arg Ala Arg Leu Leu Arg 
60 65 70 

GCG GCG GAT CTT CTA GAG GAC CGT TCT TCC GAG TTC ACC GCC GCA GCG 295
Ala Ala Asp Leu Leu Glu Asp Arg Ser Ser Glu Phe Thr Ala Ala Ala
75 80 85 

AGT GAA ACT GGC GCA GCG GGA AAC TGG TAT GGG TTT AAC GTT TAC CTG 343 
Ser Glu Thr Gly Ala Ala Gly Asn Trp Tyr Gly Phe Asn Val Tyr Leu 
90 95 100 

GCG GCG GGC ATG TTG CGG GAA GCC GCG GCC ATG ACC ACA CAG ATT CAG 391 
Ala Ala Gly Met Leu Arg Glu Ala Ala Ala Met Thr Thr Gln Ile Gln
105 110 115 

GGC GAT GTC ATT CCG TCC AAT GTG CCC GGT AGC TTT GCC ATG GCG GTT 439 
Gly Asp Val Ile Pro Ser Asn Val Pro Gly Ser Phe Ala Met Ala Val
120 125 130 135 

CGA CAG CCA TGT GGC GTG GTG CTC GGT ATT GCG CCT TGG AAT GCT CCG 487 
Arg Gln Pro Cys Gly Val Val Leu Gly Ile Ala Pro Trp Asn Ala Pro
140 145 150 

GTA ATC CTT GGC GTA CGG GCT GTT GCG ATG CCG TTG GCA TGC GGC AAT 535 
Val Ile Leu Gly Val Arg Ala Val Ala Met Pro Leu Ala Cys Gly Asn
155 160 165 

ACC GTG GTG TTG AAA AGC TCT GAG CTG AGT CCC TTT ACC CAT CGC CTG 583 
Thr Val Val Leu Lys Ser Ser Glu Leu Ser Pro Phe Thr His Arg Leu 
170 175 180 

ATT GGT CAG GTG TTG CAT GAT GCT GGT CTG GGG GAT GGC GTG GTG AAT 631 
Ile Gly Gln Val Leu His Asp Ala Gly Leu Gly Asp Gly Val Val Asn
185 190 195 

GTC ATC AGC AAT GCC CCG CAA GAC GCT CCT GCG GTG GTG GAG CGA CTG 679
Val Ile Ser Asn Ala Pro Gln Asp Ala Pro Ala Val Val Glu Arg Leu
200 205 210 215

ATT GCA AAT CCT GCG GTA CGT CGA GTG AAC TTC ACC GGT TCG ACC CAC 727
Ile Ala Asn Pro Ala Val Arg Arg Val Asn Phe Thr Gly Ser Thr His
220 225 230 

GTT GGA CGG ATC ATT GGT GAG CTG TCT GCG CGT CAT CTG AAG CCT GCT 775 
Val Gly Arg Ile Ile Gly Glu Leu Ser Ala Arg His Leu Lys Pro Ala
235 240 245 

GTG CTG GAA TTA GGT GGT AAG GCT CCG TTC TTG GTC TTG GAC GAT GCC 823 
Val Leu Glu Leu Gly Gly Lys Ala Pro Phe Leu Val Leu Asp Asp Ala 
250 255 260 

GAC CTC GAT GCG GCG GTC GAA GCG GCG GCC TTT GGT GCC TAC TTC AAT 871 
Asp Leu Asp Ala Ala Val Glu Ala Ala Ala Phe Gly Ala Tyr Phe Asn 
265 270 275 

CAG GGT CAA ATC TGC ATG TCC ACT GAG CGT CTG ATT GTG ACA GCA GTC 919 
Gln Gly Gln Ile Cys Met Ser Thr Glu Arg Leu Ile Val Thr Ala Val
280 285 290 295 

GCA GAC GCC TTT GTT GAA AAG CTG GCG AGG AAG GTC GCC ACA CTG CGT 967 
Ala Asp Ala Phe Val Glu Lys Leu Ala Arg Lys Val Ala Thr Leu Arg 
300 305 310 

GCT GGC GAT CCT AAT GAT CCG CAA TCG GTC TTG GGT TCG TTG ATT GAT 1015 
Ala Gly Asp Pro Asn Asp Pro Gln Ser Val Leu Gly Ser Leu Ile Asp
315 320 325 

GCC AAT GCA GGT CAA CGC ATC CAG GTT CTG GTC GAT GAT GCG CTC GGG 1063 
Ala Asn Ala Gly Gln Arg Ile Gln Val Leu Val Asp Asp Ala Leu
330 335 340 342 

GACAGCAAGC GAACCGGAAT TGCCAGCTGG GGCGCCCTCT GGTAAGGTTG GGAAGCCCTG 1123 

CAAAGTAAAC TGGATGGCTT TCTTGCCGCC AAGGATCTGA TGGCGCAGGG GATCAAGATC 1183 

TGATCAAGAG ACAGGATGAG GATCGTTTCG C ATG ATT GAA CAA GAT GGA TTG 1235
Met Ile Glu Gln Asp Gly Leu
1 5 

CAC GCA GGT TCT CCG GCC GCT TGG GTG GAG AGG CTA TTC GGC TAT GAC 1283 
His Ala Gly Ser Pro Ala Ala Trp Val Glu Arg Leu Phe Gly Tyr Asp 
10 15 20 

TGG GCA CAA CAG ACA ATC GGC TGC TCT GAT GCC GCC GTG TTC CGG CTG 1331 
Trp Ala Gln Gln Thr Ile Gly Cys Ser Asp Ala Ala Val Phe Arg Leu
25 30 35 

TCA GCG CAG GGG CGC CCG GTT CTT TTT GTC AAG ACC GAC CTG TCC GGT 1379
Ser Ala Gln Gly Arg Pro Val Leu Phe Val Lys Thr Asp Leu Ser Gly
40 45 50 55 

GCC CTG AAT GAA CTG CAG GAC GAG GCA GCG CGG CTA TCG TGG CTG GCC 1427 
Ala Leu Asn Glu Leu Gln Asp Glu Ala Ala Arg Leu Ser Trp Leu Ala
60 65 70

ACG ACG GGC GTT CCT TGC GCA GCT GTG CTC GAC GTT GTC ACT GAA GCG 1475
Thr Thr Gly Val Pro Cys Ala Ala Val Leu Asp Val Val Thr Glu Ala 
75 80 85 

GGA AGG GAC TGG CTG CTA TTG GGC GAA GTG CCG GGG CAG GAT CTC CTG 1523 
Gly Arg Asp Trp Leu Leu Leu Gly Glu Val Pro Gly Gln Asp Leu Leu
90 95 100 

TCA TCT CAC CTT GCT CCT GCC GAG AAA GTA TCC ATC ATG GCT GAT GCA 1571 
Ser Ser His Leu Ala Pro Ala Glu Lys Val Ser Ile Met Ala Asp Ala
105 110 115

ATG CGG CGG CTG CAT ACG CTT GAT CCG GCT ACC TGC CCA TTC GAC CAC 1619 
Met Arg Arg Leu His Thr Leu Asp Pro Ala Thr Cys Pro Phe Asp His 
120 125 130 135 

CAA GCG AAA CAT CGC ATC GAG CGA GCA CGT ACT CGG ATG GAA GCC GGT 1667 
Gln Ala Lys His Arg Ile Glu Arg Ala Arg Thr Arg Met Glu Ala Gly
140 145 150 

CTT GTC GAT CAG GAT GAT CTG GAC GAA GAG CAT CAG GGG CTC GCG CCA 1715 
Leu Val Asp Gln Asp Asp Leu Asp Glu Glu His Gln Gly Leu Ala Pro
155 160 165 

GCC GAA CTG TTC GCC AGG CTC AAG GCG CGC ATG CCC GAC GGC GAG GAT 1763 
Ala Glu Leu Phe Ala Arg Leu Lys Ala Arg Met Pro Asp Gly Glu Asp 
170 175 180 

CTC GTC GTG ACC CAT GGC GAT GCC TGC TTG CCG AAT ATC ATG GTG GAA 1811 
Leu Val Val Thr His Gly Asp Ala Cys Leu Pro Asn Ile Met Val Glu
185 190 195 

AAT GGC CGC TTT TCT GGA TTC ATC GAC TGT GGC CGG CTG GGT GTG GCG 1859 
Asn Gly Arg Phe Ser Gly Phe Ile Asp Cys Gly Arg Leu Gly Val Ala
200 205 210 215 

GAC CGC TAT CAG GAC ATA GCG TTG GCT ACC CGT GAT ATT GCT GAA GAG 1907 
Asp Arg Tyr Gln Asp Ile Ala Leu Ala Thr Arg Asp Ile Ala Glu Glu
220 225 230 

CTT GGC GGC GAA TGG GCT GAC CGC TTC CTC GTG CTT TAC GGT ATC GCC 1955 
Leu Gly Gly Glu Trp Ala Asp Arg Phe Leu Val Leu Tyr Gly Ile Ala
235 240 245 

GCT CCC GAT TCG CAG CGC ATC GCC TTC TAT CGC CTT CTT GAC GAG TTC 2003 
Ala Pro Asp Ser Gln Arg Ile Ala Phe Tyr Arg Leu Leu Asp Glu Phe
250 255 260 

TTC TGA GCGGGACTCT GGGGTTCGAA ATGACCGACC AAGCGACGCC CG GCC CAG 2057
Phe Ala Gln
264 421

CGC GTC GAT TCG GGC ATT TGC CAT ATC AAT GGA CCG ACT GTG CAT GAC 2105 
Arg Val Asp Ser Gly Ile Cys His Ile Asn Gly Pro Thr Val His Asp
425 430 435

GAG GCT CAG ATG CCA TTC GGT GGG GTG AAG TCC AGC GGC TAC GGC AGC 2153
Glu Ala Gln Met Pro Phe Gly Gly Val Lys Ser Ser Gly Tyr Gly Ser
440 445 450 

TTC GGC AGT CGA GCA TCG ATT GAG CAC TTT ACC CAG CTG CGC TGG CTG 2201 
Phe Gly Ser Arg Ala Ser Ile Glu His Phe Thr Gln Leu Arg Trp Leu
455 460 465 470 

ACC ATT CAG AAT GGC CCG CGG CAC TAT CCA ATC TAA ATCGATCTTC 2247 
Thr Ile Gln Asn Gly Pro Arg His Tyr Pro Ile
475 480 481 

GGGCGCCGCG GGCATCATGC CCGCGGCGCT CGCCTCATTT CAATCTCTAA CTTGATAAAA 2307 

ACAGAGCTGT TCTCCGGTCT TGGTGGATCA AGGCCAGTCG CGGAGAGTCT CGAAGAGGAG 2367

AGTACAGTGA ACGCCGAGTC CACATTGCAA CCGCAGGCAT CATCATGCTC TGCTCAGCCA 2427 

CGCTACCGCA GTGTGTCGAT TGGTCATCCT CCGGTTGAGG TTACGCAAGA CGCTGGAGGT 2487 

ATTGTCCGGA TGCGTTCTCT CGAGGCGCTT CTTCCCTTCC CGGGTGGAAT TC 2539



FIG. 2m:

GAATTCCAAT AATGACAATA ATGAGGAGTG CCCA ATG TTT CAC GTG CCC CTG CTT 55

Met Phe His Val Pro Leu Leu 
1 5 

ATT GGT GGT AAG CCT TGT TCA GCA TCT GAT GAG CGC ACC TTC GAG CGT 103 
Ile Gly Gly Lys Pro Cys Ser Ala Ser Asp Glu Arg Thr Phe Glu Arg
10 15 20 

CGT AGC CCG CTG ACC GGA GAA GTG GTA TCG CGC GTC GCT GCT GCC AGT 151 
Arg Ser Pro Leu Thr Gly Glu Val Val Ser Arg Val Ala Ala Ala Ser 
25 30 35 

TTG GAA GAT GCG GAC GCC GCA GTG GCC GCT GCA CAG GCT GCG TTT CCT 199
Leu Glu Asp Ala Asp Ala Ala Val Ala Ala Ala Gln Ala Ala Phe Pro
40 45 50 55 

GAA TGG GCG GCG CTT GCT CCG AGC GAA CGC CGT GCC CGA CTG CTG CGA 247 
Glu Trp Ala Ala Leu Ala Pro Ser Glu Arg Arg Ala Arg Leu Leu Arg 
60 65 70 

GCG GCG GAT CTT CTA GAG GAC CGT TCT TCC GAG TTC ACC GCC GCA GCG 295 
Ala Ala Asp Leu Leu Glu Asp Arg Ser Ser Glu Phe Thr Ala Ala Ala 
75 80 85 

AGT GAA ACT GGC GCA GCG GGA AAC TGG TAT GGG TTT AAC GTT TAC CTG 343 
Ser Glu Thr Gly Ala Ala Gly Asn Trp Tyr Gly Phe Asn Val Tyr Leu 
90 95 100 

GCG GCG GGC ATG TTG CGG GAA GCC GCG GCC ATG ACC ACA CAG ATT CAG 391
Ala Ala Gly Met Leu Arg Glu Ala Ala Ala Met Thr Thr Gln Ile Gln
105 110 115 

GGC GAT GTC ATT CCG TCC AAT GTG CCC GGT AGC TTT GCC ATG GCG GTT 439 
Gly Asp Val Ile Pro Ser Asn Val Pro Gly Ser Phe Ala Met Ala Val
120 125 130 135 

CGA CAG CCA TGT GGC GTG GTG CTC GGT ATT GCG CCT TGG AAT GCT CCG 487 
Arg Gln Pro Cys Gly Val Val Leu Gly Ile Ala Pro Trp Asn Ala Pro
140 145 150 

GTA ATC CTT GGC GTA CGG GCT GTT GCG ATG CCG TTG GCA TGC GGC AAT 535
Val Ile Leu Gly Val Arg Ala Val Ala Met Pro Leu Ala Cys Gly Asn
155 160 165 

ACC GTG GTG TTG AAA AGC TCT GAG CTG AGT CCC TTT ACC CAT CGC CTG 583 
Thr Val Val Leu Lys Ser Ser Glu Leu Ser Pro Phe Thr His Arg Leu 
170 175 180 

ATT GGT CAG GTG TTG CAT GAT GCT GGT CTG GGG GAT GGC GTG GTG AAT 631 
Ile Gly Gln Val Leu His Asp Ala Gly Leu Gly Asp Gly Val Val Asn
185 190 195 

GTC ATC AGC AAT GCC CCG CAA GAC GCT CCT GCG GTG GTG GAG CGA CTG 679 
Val Ile Ser Asn Ala Pro Gln Asp Ala Pro Ala Val Val Glu Arg Leu
200 205 210 215

ATT GCA AAT CCT GCG GTA CGT CGA GTG AAC TTC ACC GGT TCG ACC CAC 727
Ile Ala Asn Pro Ala Val Arg Arg Val Asn Phe Thr Gly Ser Thr His
220 225 230 

GTT GGA CGG ATC ATT GGT GAG CTG TCT GCG CGT CAT CTG AAG CCT GCT 775 
Val Gly Arg Ile Ile Gly Glu Leu Ser Ala Arg His Leu Lys Pro Ala
235 240 245 

GTG CTG GAA TTA GGT GGT AAG GCT CCG TTC TTG GTC TTG GAC GAT GCC 823 
Val Leu Glu Leu Gly Gly Lys Ala Pro Phe Leu Val Leu Asp Asp Ala 
250 255 260 

GAC CTC GAT GCG GCG GTC GAA GCG GCG GCC TTT GGT GCC TAC TTC AAT 871 
Asp Leu Asp Ala Ala Val Glu Ala Ala Ala Phe Gly Ala Tyr Phe Asn 
265 270 275 

CAG GGT CAA ATC TGC ATG TCC ACT GAG CGT CTG ATT GTG ACA GCA GTC 919 
Gln Gly Gln Ile Cys Met Ser Thr Glu Arg Leu Ile Val Thr Ala Val
280 285 290 295 

GCA GAC GCC TTT GTT GAA AAG CTG GCG AGG AAG GTC GCC ACA CTG CGT 967 
Ala Asp Ala Phe Val Glu Lys Leu Ala Arg Lys Val Ala Thr Leu Arg 
300 305 310 

GCT GGC GAT CCT AAT GAT CCG CAA TCG GTC TTG GGT TCG TTG ATT GAT 1015 
Ala Gly Asp Pro Asn Asp Pro Gln Ser Val Leu Gly Ser Leu Ile Asp
315 320 325 

GCC AAT GCA GGT CAA CGC ATC CAG GTGGGGAGAG GCGGTTTGCG TATTGGGCGC 1069 
Ala Asn Ala Gly Gln Arg Ile Gln
330 335 

ATGCATAAAA ACTGTTGTAA TTCATTAAGC ATTCTGCCGA CATGGAAGCC ATCACAAACG 1129 

GCATGATGAA CCTGAATCGC CAGCGGCATC AGCACCTTGT CGCCTTGCGT ATAATATTTG 1189 

CCCATGGACG CACACCGTGG AAACGGATGA AGGCACGAAC CCAGTTGACA TAAGCCTGTT 1249 

CGGTTCGTAA ACTGTAATGC AAGTAGCGTA TGCGCTCACG CAACTGGTCC AGAACCTTGA 1309

CCGAACGCAG CGGTGGTAAC GGCGCAGTGG CGGTTTTCAT GGCTTGTTAT GACTGTTTTT 1369

TTGTACAGTC TATGCCTCGG GCATCCAAGC AGCAAGCGCG TTACGCCGTG GGTCGATGTT 1429 

TGATGTTATG GAGCAGCAAC G ATG TTA CGC AGC AGC AAC GAT GTT ACG CAG 1480
Met Leu Arg Ser Ser Asn Asp Val Thr Gln
1 5 10

CAG GGC AGT CGC CCT AAA ACA AAG TTA GGT GGC TCA AGT ATG GGC ATC 1528 
Gln Gly Ser Arg Pro Lys Thr Lys Leu Gly Gly Ser Ser Met Gly Ile
15 20 25 

ATT CGC ACA TGT AGG CTC GGC CCT GAC CAA GTC AAA TCC ATG CGG GCT 1576 
Ile Arg Thr Cys Arg Leu Gly Pro Asp Gln Val Lys Ser Met Arg Ala
30 35 40

GCT CTT GAT CTT TTC GGT CGT GAG TTC GGA GAC GTA GCC ACC TAC TCC 1624
Ala Leu Asp Leu Phe Gly Arg Glu Phe Gly Asp Val Ala Thr Tyr Ser 
45 50 55 

CAA CAT CAG CCG GAC TCC GAT TAC CTC GGG AAC TTG CTC CGT AGT AAG 1672 
Gln His Gln Pro Asp Ser Asp Tyr Leu Gly Asn Leu Leu Arg Ser Lys
60 65 70 

ACA TTC ATC GCG CTT GCT GCC TTC GAC CAA GAA GCG GTT GTT GGC GCT 1720
Thr Phe Ile Ala Leu Ala Ala Phe Asp Gln Glu Ala Val Val Gly Ala
75 80 85 90 

CTC GCG GCT TAC GTT CTG CCC AGG TTT GAG CAG CCG CGT AGT GAG ATC 1768 
Leu Ala Ala Tyr Val Leu Pro Arg Phe Glu Gln Pro Arg Ser Glu Ile
95 100 105 

TAT ATC TAT GAT CTC GCA GTC TCC GGC GAG CAC CGG AGG CAG GGC ATT 1816 
Tyr Ile Tyr Asp Leu Ala Val Ser Gly Glu His Arg Arg Gln Gly Ile
110 115 120 

GCC ACC GCG CTC ATC AAT CTC CTC AAG CAT GAG GCC AAC GCG CTT GGT 1864 
Ala Thr Ala Leu Ile Asn Leu Leu Lys His Glu Ala Asn Ala Leu Gly
125 130 135 

GCT TAT GTG ATC TAC GTG CAA GCA GAT TAC GGT GAC GAT CCC GCA GTG 1912 
Ala Tyr Val Ile Tyr Val Gln Ala Asp Tyr Gly Asp Asp Pro Ala Val
140 145 150 

GCT CTC TAT ACA AAG TTG GGC ATA CGG GAA GAA GTG ATG CAC TTT GAT 1960
Ala Leu Tyr Thr Lys Leu Gly Ile Arg Glu Glu Val Met His Phe Asp
155 160 165 170 

ATC GAC CCA AGT ACC GCC ACC TAA CAATTCGTTC AAGCCGAGAT CGGCTTCCCA 2014
Ile Asp Pro Ser Thr Ala Thr
175 177 

A TTG GCC CAG CGC GTC GAT TCG GGC ATT TGC CAT ATC AAT GGA CCG ACT 2063 
Leu Ala Gln Arg Val Asp Ser Gly Ile Cys His Ile Asn Gly Pro Thr
420 425 430 435 

GTG CAT GAC GAG GCT CAG ATG CCA TTC GGT GGG GTG AAG TCC AGC GGC 2111 
Val His Asp Glu Ala Gln Met Pro Phe Gly Gly Val Lys Ser Ser Gly
440 445 450 

TAC GGC AGC TTC GGC AGT CGA GCA TCG ATT GAG CAC TTT ACC CAG CTG 2159 
Tyr Gly Ser Phe Gly Ser Arg Ala Ser Ile Glu His Phe Thr Gln Leu
455 460 465 

CGC TGG CTG ACC ATT CAG AAT GGC CCG CGG CAC TAT CCA ATC TAA 2204 
Arg Trp Leu Thr Ile Gln Asn Gly Pro Arg His Tyr Pro Ile
470 475 480 481 

ATCGATCTTC GGGCGCCGCG GGCATCATGC CCGCGGCGCT CGCCTCATTT CAATCTCTAA 2264 

CTTGATAAAA ACAGAGCTGT TCTCCGGTCT TGGTGGATCA AGGCCAGTCG CGGAGAGTCT 2324

CGAAGAGGAG AGTACAGTGA ACGCCGAGTC CACATTGCAA CCGCAGGCAT CATCATGCTC

TGCTCAGCCA CGCTACCGCA GTGTGTCGAT TGGTCATCCT CCGGTTGAGG TTACGCAAGA

CGCTGGAGGT ATTGTCCGGA TGCGTTCTCT CGAGGCGCTT CTTCCCTTCC CGGGTGGAAT

TC

FIG. 2n:

GAATTCCAAT AATGACAATA ATGAGGAGTG CCCA ATG TTT CAC GTG CCC CTG CTT 55

Met Phe His Val Pro Leu Leu 
1 5 

ATT GGT GGT AAG CCT TGT TCA GCA TCT GAT GAG CGC ACC TTC GAG CGT 103 
Ile Gly Gly Lys Pro Cys Ser Ala Ser Asp Glu Arg Thr Phe Glu Arg
10 15 20 

CGT AGC CCG CTG ACC GGA GAA GTG GTA TCG CGC GTC GCT GCT GCC AGT 151 
Arg Ser Pro Leu Thr Gly Glu Val Val Ser Arg Val Ala Ala Ala Ser 
25 30 35 

TTG GAA GAT GCG GAC GCC GCA GTG GCC GCT GCA CAG GCT GCG TTT CCT 199 
Leu Glu Asp Ala Asp Ala Ala Val Ala Ala Ala Gln Ala Ala Phe Pro
40 45 50 55 

GAA TGG GCG GCG CTT GCT CCG AGC GAA CGC CGT GCC CGA CTG CTG CGA 247 
Glu Trp Ala Ala Leu Ala Pro Ser Glu Arg Arg Ala Arg Leu Leu Arg 
60 65 70 

GCG GCG GAT CTT CTA GAG GAC CGT TCT TCC GAG TTC ACC GCC GCA GCG 295 
Ala Ala Asp Leu Leu Glu Asp Arg Ser Ser Glu Phe Thr Ala Ala Ala 
75 80 85 

AGT GAA ACT GGC GCA GCG GGA AAC TGG TAT GGG TTT AAC GTT TAC CTG 343
Ser Glu Thr Gly Ala Ala Gly Asn Trp Tyr Gly Phe Asn Val Tyr Leu 
90 95 100 

GCG GCG GGC ATG TTG CGG GAA GCC GCG GCC ATG ACC ACA CAG ATT CAG 391
Ala Ala Gly Met Leu Arg Glu Ala Ala Ala Met Thr Thr Gln Ile Gln
105 110 115 

GGC GAT GTC ATT CCG TCC AAT GTG CCC GGT AGC TTT GCC ATG GCG GTT 439 
Gly Asp Val Ile Pro Ser Asn Val Pro Gly Ser Phe Ala Met Ala Val
120 125 130 135 

CGA CAG CCA TGT GGC GTG GTG CTC GGT ATT GCG CCT TGG AAT GCT CCG 487
Arg Gln Pro Cys Gly Val Val Leu Gly Ile Ala Pro Trp Asn Ala Pro
140 145 150 

GTA ATC CTT GGC GTA CGG GCT GTT GCG ATG CCG TTG GCA TGC GGC AAT 535
Val Ile Leu Gly Val Arg Ala Val Ala Met Pro Leu Ala Cys Gly Asn
155 160 165 

ACC GTG GTG TTG AAA AGC TCT GAG CTG AGT CCC TTT ACC CAT CGC CTG 583 
Thr Val Val Leu Lys Ser Ser Glu Leu Ser Pro Phe Thr His Arg Leu 
170 175 180 

ATT GGT CAG GTG TTG CAT GAT GCT GGT CTG GGG GAT GGC GTG GTG AAT 631 
Ile Gly Gln Val Leu His Asp Ala Gly Leu Gly Asp Gly Val Val Asn
185 190 195 

GTC ATC AGC AAT GCC CCG CAA GAC GCT CCT GCG GTG GTG GAG CGA CTG 679 
Val Ile Ser Asn Ala Pro Gln Asp Ala Pro Ala Val Val Glu Arg Leu
200 205 210 215

ATT GCA AAT CCT GCG GTA CGT CGA GTG AAC TTC ACC GGT TCG ACC CAC 727
Ile Ala Asn Pro Ala Val Arg Arg Val Asn Phe Thr Gly Ser Thr His
220 225 230 

GTT GGA CGG ATC ATT GGT GAG CTG TCT GCG CGT CAT CTG AAG CCT GCT 775
Val Gly Arg Ile Ile Gly Glu Leu Ser Ala Arg His Leu Lys Pro Ala
235 240 245 

GTG CTG GAA TTA GGT GGT AAG GCT CCG TTC TTG GTC TTG GAC GAT GCC 823 
Val Leu Glu Leu Gly Gly Lys Ala Pro Phe Leu Val Leu Asp Asp Ala 
250 255 260 

GAC CTC GAT GCG GCG GTC GAA GCG GCG GCC TTT GGT GCC TAC TTC AAT 871 
Asp Leu Asp Ala Ala Val Glu Ala Ala Ala Phe Gly Ala Tyr Phe Asn 
265 270 275 

CAG GGT CAA ATC TGC ATG TCC ACT GAG CGT CTG ATT GTG ACA GCA GTC 919 
Gln Gly Gln Ile Cys Met Ser Thr Glu Arg Leu Ile Val Thr Ala Val
280 285 290 295 

GCA GAC GCC TTT GTT GAA AAG CTG GCG AGG AAG GTC GCC ACA CTG CGT 967 
Ala Asp Ala Phe Val Glu Lys Leu Ala Arg Lys Val Ala Thr Leu Arg 
300 305 310 

GCT GGC GAT CCT AAT GAT CCG CAA TCG GTC TTG GGT TCG TTG ATT GAT 1015 
Ala Gly Asp Pro Asn Asp Pro Gln Ser Val Leu Gly Ser Leu Ile Asp
315 320 325 

GCC AAT GCA GGT CAA CGC ATC CAG GTT CTG GTC GAT GAT GCG CTC GCA 1063 
Ala Asn Ala Gly Gln Arg Ile Gln Val Leu Val Asp Asp Ala Leu Ala
330 335 340 

AAA GGC GCG CAATGGAA TTG GCC CAG CGC GTC GAT TCG GGC ATT TGC CAT 1113 
Lys Gly Ala Leu Ala Gln Arg Val Asp Ser Gly Ile Cys His
345 346 420 425 430

ATC AAT GGA CCG ACT GTG CAT GAC GAG GCT CAG ATG CCA TTC GGT GGG 1161 
Ile Asn Gly Pro Thr Val His Asp Glu Ala Gln Met Pro Phe Gly Gly
435 440 445 

GTG AAG TCC AGC GGC TAC GGC AGC TTC GGC AGT CGA GCA TCG ATT GAG 1209
Val Lys Ser Ser Gly Tyr Gly Ser Phe Gly Ser Arg Ala Ser Ile Glu
450 455 460 

CAC TTT ACC CAG CTG CGC TGG CTG ACC ATT CAG AAT GGC CCG CGG CAC 1257 
His Phe Thr Gln Leu Arg Trp Leu Thr Ile Gln Asn Gly Pro Arg His
465 470 475 

TAT CCA ATC TAA ATCGATCTTC GGGCGCCGCG GGCATCATGC CCGCGGCGCT 1309 
Tyr Pro Ile
480 481 

CGCCTCATTT CAATCTCTAA CTTGATAAAA ACAGAGCTGT TCTCCGGTCT TGGTGGATCA 1369

AGGCCAGTCG CGGAGAGTCT CGAAGAGGAG AGTACAGTGA ACGCCGAGTC CACATTGCAA 1429

CCGCAGGCAT CATCATGCTC TGCTCAGCCA CGCTACCGCA GTGTGTCGAT TGGTCATCCT 1489

CCGGTTGAGG TTACGCAAGA CGCTGGAGGT ATTGTCCGGA TGCGTTCTCT CGAGGCGCTT 1549 

CTTCCCTTCC CGGGTGGAAT TC 1571 
FIG. 2o:

GAATTCCGCG GTCGGCGAAA GTTGATGCGC TGTATCGTGG TGAAGATCAA TCCATGCTGC 60

GTGACGAGGC CACACT GTG AGT TGG TCA GGG GGG GCT TAC TCG GCG TTT TCC 112
Met Ser Trp Ser Gly Gly Ala Tyr Ser Ala Phe Ser
1 5 10

GAC ACT GCG TTG GTT GCG GCA GTG CGC ACC CCC TGG ATT GAT TGC GGG 160 
Asp Thr Ala Leu Val Ala Ala Val Arg Thr Pro Trp Ile Asp Cys Gly
15 20 25 

GGT GCC CTG TCG CTG GTG TCG CCT ATC GAC TTA GGG GTA AAG GTC GCT 208 
Gly Ala Leu Ser Leu Val Ser Pro Ile Asp Leu Gly Val Lys Val Ala
30 35 40 

CGC GAA GTT CTG ATG CGT GCG TCG CTT GAA CCA CAA ATG GTC GAT AGC 256 
Arg Glu Val Leu Met Arg Ala Ser Leu Glu Pro Gln Met Val Asp Ser
45 50 55 60 

GTA CTC GCA GGC TCT ATG GCT CAA GCA AGC TTT GAT GCT TAC CTG CTC 304
Val Leu Ala Gly Ser Met Ala Gln Ala Ser Phe Asp Ala Tyr Leu Leu
65 70 75 

CCG CGG CAC ATT GGC TTG TAC AGC GGT GTT CCC AAG TCG GTT CCG GCC 352
Pro Arg His Ile Gly Leu Tyr Ser Gly Val Pro Lys Ser Val Pro Ala
80 85 90 

TTG GGG GTG CAG CGC ATT TGC GGC ACA GGC TTC GAA CTG CTT CGG CAG 400 
Leu Gly Val Gln Arg Ile Cys Gly Thr Gly Phe Glu Leu Leu Arg Gln
95 100 105 

GCC GGC GAG CAG ATT TCC CAA GGC GCT GAT CAC GTG CTG TGT GTC GCG 448 
Ala Gly Glu Gln Ile Ser Gln Gly Ala Asp His Val Leu Cys Val Ala
110 115 120 

GCA GAG TCC ATG TCG CGT AAC CCC ATC GCG TCG TAT ACA CAC CGG GGC 496 
Ala Glu Ser Met Ser Arg Asn Pro Ile Ala Ser Tyr Thr His Arg Gly
125 130 135 140 

GGG TTC CGC CTC GGT GCG CCC GTT GAG TTC AAG GAT TTT TTG TGG GAG 544 
Gly Phe Arg Leu Gly Ala Pro Val Glu Phe Lys Asp Phe Leu Trp Glu 
145 150 155 

GCA TTG TTT GAT CCT GCT CCA GGA CTC GAC ATG ATC GCT ACC GCA GAA 592 
Ala Leu Phe Asp Pro Ala Pro Gly Leu Asp Met Ile Ala Thr Ala Glu
160 165 170 

AAC CTG GGGACAGCAA GCGAACCGGA ATTGCCAGCT GGGGCGCCCT CTGGTAAGGT 648 
Asn Leu 
174 

TGGGAAGCCC TGCAAAGTAA ACTGGATGGC TTTCTTGCCG CCAAGGATCT GATGGCGCAG 708 

GGGATCAAGA TCTGATCAAG AGACAGGATG AGGATCGTTT CGC ATG ATT GAA CAA 763
Met Ile Glu Gln
1

GAT GGA TTG CAC GCA GGT TCT CCG GCC GCT TGG GTG GAG AGG CTA TTC 811
Asp Gly Leu His Ala Gly Ser Pro Ala Ala Trp Val Glu Arg Leu Phe 
5 10 15 20 

GGC TAT GAC TGG GCA CAA CAG ACA ATC GGC TGC TCT GAT GCC GCC GTG 859 
Gly Tyr Asp Trp Ala Gln Gln Thr Ile Gly Cys Ser Asp Ala Ala Val
25 30 35 

TTC CGG CTG TCA GCG CAG GGG CGC CCG GTT CTT TTT GTC AAG ACC GAC 907 
Phe Arg Leu Ser Ala Gln Gly Arg Pro Val Leu Phe Val Lys Thr Asp
40 45 50 

CTG TCC GGT GCC CTG AAT GAA CTG CAG GAC GAG GCA GCG CGG CTA TCG 955 
Leu Ser Gly Ala Leu Asn Glu Leu Gln Asp Glu Ala Ala Arg Leu Ser
55 60 65 

TGG CTG GCC ACG ACG GGC GTT CCT TGC GCA GCT GTG CTC GAC GTT GTC 1003 
Trp Leu Ala Thr Thr Gly Val Pro Cys Ala Ala Val Leu Asp Val Val 
70 75 80 

ACT GAA GCG GGA AGG GAC TGG CTG CTA TTG GGC GAA GTG CCG GGG CAG 1051 
Thr Glu Ala Gly Arg Asp Trp Leu Leu Leu Gly Glu Val Pro Gly Gln
85 90 95 100 

GAT CTC CTG TCA TCT CAC CTT GCT CCT GCC GAG AAA GTA TCC ATC ATG 1099 
Asp Leu Leu Ser Ser His Leu Ala Pro Ala Glu Lys Val Ser Ile Met
105 110 115 

GCT GAT GCA ATG CGG CGG CTG CAT ACG CTT GAT CCG GCT ACC TGC CCA 1147 
Ala Asp Ala Met Arg Arg Leu His Thr Leu Asp Pro Ala Thr Cys Pro 
120 125 130 

TTC GAC CAC CAA GCG AAA CAT CGC ATC GAG CGA GCA CGT ACT CGG ATG 1195 
Phe Asp His Gln Ala Lys His Arg Ile Glu Arg Ala Arg Thr Arg Met
135 140 145 

GAA GCC GGT CTT GTC GAT CAG GAT GAT CTG GAC GAA GAG CAT CAG GGG 1243 
Glu Ala Gly Leu Val Asp Gln Asp Asp Leu Asp Glu Glu His Gln Gly
150 155 160 

CTC GCG CCA GCC GAA CTG TTC GCC AGG CTC AAG GCG CGC ATG CCC GAC 1291 
Leu Ala Pro Ala Glu Leu Phe Ala Arg Leu Lys Ala Arg Met Pro Asp 
165 170 175 180 

GGC GAG GAT CTC GTC GTG ACC CAT GGC GAT GCC TGC TTG CCG AAT ATC 1339 
Gly Glu Asp Leu Val Val Thr His Gly Asp Ala Cys Leu Pro Asn Ile
185 190 195 

ATG GTG GAA AAT GGC CGC TTT TCT GGA TTC ATC GAC TGT GGC CGG CTG 1387 
Met Val Glu Asn Gly Arg Phe Ser Gly Phe Ile Asp Cys Gly Arg Leu
200 205 210 

GGT GTG GCG GAC CGC TAT CAG GAC ATA GCG TTG GCT ACC CGT GAT ATT 1435 
Gly Val Ala Asp Arg Tyr Gln Asp Ile Ala Leu Ala Thr Arg Asp Ile
215 220 225

GCT GAA GAG CTT GGC GGC GAA TGG GCT GAC CGC TTC CTC GTG CTT TAC 1483
Ala Glu Glu Leu Gly Gly Glu Trp Ala Asp Arg Phe Leu Val Leu Tyr 
230 235 240 

GGT ATC GCC GCT CCC GAT TCG CAG CGC ATC GCC TTC TAT CGC CTT CTT 1531 
Gly Ile Ala Ala Pro Asp Ser Gln Arg Ile Ala Phe Tyr Arg Leu Leu
245 250 255 260 

GAC GAG TTC TTC TGA GCGGGACTCT GGGGTTCGAA ATGACCGACC AAGCGACGCC 1586 
Asp Glu Phe Phe 
264 

CA TTG AGG GCG CAA GAG GAG AAA TGG ATT GAC CAA GAG ATC GTG GCT 1633
Leu Arg Ala Gln Glu Glu Lys Trp Ile Asp Gln Glu Ile Val Ala
197 200 205 210 

GTT ACG GAT GAA CAG TTC GAT TTA GAG GGC TAC AAC AGT CGA GCA ATT 1681 
Val Thr Asp Glu Gln Phe Asp Leu Glu Gly Tyr Asn Ser Arg Ala Ile
215 220 225 

GAA CTG CCT CGG AAG GCA AAA TTG TTG ATC GTG ACA GTC ATC CGC GGC 1729
Glu Leu Pro Arg Lys Ala Lys Leu Leu Ile Val Thr Val Ile Arg Gly
230 235 240 

CTA GCA GTC TTT GAA GCC CTT TCC CGA TTG AAG CCT GTT CAT TCT GGC 1777 
Leu Ala Val Phe Glu Ala Leu Ser Arg Leu Lys Pro Val His Ser Gly 
245 250 255 

GGG GTG CAG ACT GCG GGC AAC AGC TGT GCC GTA GTG GAC GGC GCC GCG 1825 
Gly Val Gln Thr Ala Gly Asn Ser Cys Ala Val Val Asp Gly Ala Ala
260 265 270 275 

GCG GCT TTG GTG GCT CGA GAG TCG TCT GCG ACA CAG CCG GTC TTG GCT 1873 
Ala Ala Leu Val Ala Arg Glu Ser Ser Ala Thr Gln Pro Val Leu Ala
280 285 290 

AGG ATA CTG GCT ACC TCC GTA GTC GGG ATC GAG CCC GAG CAT ATG GGG 1921 
Arg Ile Leu Ala Thr Ser Val Val Gly Ile Glu Pro Glu His Met Gly
295 300 305 

CTC GGC CCT GCG CCC GCG ATT CGC CTG CTG CTT GCG CGT AGT GAT CTT 1969 
Leu Gly Pro Ala Pro Ala Ile Arg Leu Leu Leu Ala Arg Ser Asp Leu
310 315 320 

AGT TTG AGG GAT ATC GAC CTC TTT GAG ATA AAC GAG GCG CAG GCC GCC 2017
Ser Leu Arg Asp Ile Asp Leu Phe Glu Ile Asn Glu Ala Gln Ala Ala
325 330 335 

CAA GTT CTA GCG GTA CAG CAT GAA TTG GGT ATT GAG CAC TCA AAA CTT 2065
Gln Val Leu Ala Val Gln His Glu Leu Gly Ile Glu His Ser Lys Leu
340 345 350 355 

AAT ATT TGG GGC GGG GCC ATT GCA CTT GGA CAC CCG CTT GCC GCG ACC 2113 
Asn Ile Trp Gly Gly Ala Ile Ala Leu Gly His Pro Leu Ala Ala Thr
360 365 370

GGA TTG CGT CTC TGC ATG ACC CTC GCT CAC CAA TTG CAA GCT AAT AAC 2161
Gly Leu Arg Leu Cys Met Thr Leu Ala His Gln Leu Gln Ala Asn Asn
375 380 385 

TTT CGA TAT GGA ATT GCC TCG GCA TGC ATT GGT GGG GGA CAG GGG ATG 2209
Phe Arg Tyr Gly Ile Ala Ser Ala Cys Ile Gly Gly Gly Gln Gly Met
390 395 400 

GCG GTT CTT TTA GAG AAT CCC CAC TTC GGT TCG TCC TCT GCA CGA AGT 2257 
Ala Val Leu Leu Glu Asn Pro His Phe Gly Ser Ser Ser Ala Arg Ser 
405 410 415 

TCG ATG ATT AAC AGA GTT GAC CAC TAT CCA CTG AGC TAA CGGGCATCTC 2306
Ser Met Ile Asn Arg Val Asp His Tyr Pro Leu Ser
420 425 430 431 

CTTTGTTGCT TTGAGGTGGC GCACGAAGGA GGGCTCGAAA ATCTCTGCTA AAAACAAGAA 2366

GAAGGAACAG GGAACATGAT TAGTTTCGCT CGTATGGCAG AAAGTTTAGG AGTCCAGGCT 2426

AAACTTGCCC TTGCCTTCGC ACTCGTATTA TGTGTCGGGC TGATTGTTAC CGGCACGGGT 2486

TTCTACAGTG TACATACCTT GTCAGGGTTG GTGGGAATTC 2526



FIG. 2p:

GAATTCCGCG GTCGGCGAAA GTTGATGCGC TGTATCGTGG TGAAGATCAA TCCATGCTGC 60

GTGACGAGGC CACACT GTG AGT TGG TCA GGG GGG GCT TAC TCG GCG TTT TCC 112 
Met Ser Trp Ser Gly Gly Ala Tyr Ser Ala Phe Ser 
1 5 10

GAC ACT GCG TTG GTT GCG GCA GTG CGC ACC CCC TGG ATT GAT TGC GGG 160 
Asp Thr Ala Leu Val Ala Ala Val Arg Thr Pro Trp Ile Asp Cys Gly
15 20 25 

GGT GCC CTG TCG CTG GTG TCG CCT ATC GAC TTA GGG GTA AAG GTC GCT 208
Gly Ala Leu Ser Leu Val Ser Pro Ile Asp Leu Gly Val Lys Val Ala
30 35 40 

CGC GAA GTT CTG ATG CGT GCG TCG CTT GAA CCA CAA ATG GTC GAT AGC 256
Arg Glu Val Leu Met Arg Ala Ser Leu Glu Pro Gln Met Val Asp Ser
45 50 55 60 

GTA CTC GCA GGC TCT ATG GCT CAA GCA AGC TTT GAT GCT TAC CTG CTC 304
Val Leu Ala Gly Ser Met Ala Gln Ala Ser Phe Asp Ala Tyr Leu Leu
65 70 75 

CCG CGG CAC ATT GGC TTG TAC AGC GGT GTT CCC AAG TCG GTT CCG GCC 352 
Pro Arg His Ile Gly Leu Tyr Ser Gly Val Pro Lys Ser Val Pro Ala
80 85 90 

TTG GGG GTG CAG CGC ATT TGC GGC ACA GGC TTC GAA CTG CTT CGG CAG 400
Leu Gly Val Gln Arg Ile Cys Gly Thr Gly Phe Glu Leu Leu Arg Gln
95 100 105 

GCC GGC GAG CAG ATT TCC CAA GGC GCT GAT CAC GTG CTG TGT GTC GCG 448 
Ala Gly Glu Gln Ile Ser Gln Gly Ala Asp His Val Leu Cys Val Ala
110 115 120 

GCA GAG TCC ATG TCG CGT AAC CCC ATC GCG TCG TAT ACA CAC CGG GGC 496 
Ala Glu Ser Met Ser Arg Asn Pro Ile Ala Ser Tyr Thr His Arg Gly
125 130 135 140 

GGG TTC CGC CTC GGT GCG CCC GTT GAG TTC AAG GAT TTT TTG TGG GAG 544 
Gly Phe Arg Leu Gly Ala Pro Val Glu Phe Lys Asp Phe Leu Trp Glu 
145 150 155 

GCA TTG TTT GAT CCT GCT CCA GGA CTC GAC ATG ATC GCT ACC GCA GAA 592 
Ala Leu Phe Asp Pro Ala Pro Gly Leu Asp Met Ile Ala Thr Ala Glu
160 165 170 

AAC CTG GGGGAGAGGC GGTTTGCGTA TTGGGCGCAT GCATAAAAAC TGTTGTAATT 648 
Asn Leu 
174 

CATTAAGCAT TCTGCCGACA TGGAAGCCAT CACAAACGGC ATGATGAACC TGAATCGCCA 708

GCGGCATCAG CACCTTGTCG CCTTGCGTAT AATATTTGCC CATGGACGCA CACCGTGGAA 768

ACGGATGAAG GCACGAACCC AGTTGACATA AGCCTGTTCG GTTCGTAAAC TGTAATGCAA 828 

GTAGCGTATG CGCTCACGCA ACTGGTCCAG AACCTTGACC GAACGCAGCG GTGGTAACGG 888 

CGCAGTGGCG GTTTTCATGG CTTGTTATGA CTGTTTTTTT GTACAGTCTA TGCCTCGGGC 948 

ATCCAAGCAG CAAGCGCGTT ACGCCGTGGG TCGATGTTTG ATGTTATGGA GCAGCAACG 1007 

ATG TTA CGC AGC AGC AAC GAT GTT ACG CAG CAG GGC AGT CGC CCT AAA 1055 
Met Leu Arg Ser Ser Asn Asp Val Thr Gln Gln Gly Ser Arg Pro Lys 
1 5 10 15 

ACA AAG TTA GGT GGC TCA AGT ATG GGC ATC ATT CGC ACA TGT AGG CTC 1103 
Thr Lys Leu Gly Gly Ser Ser Met Gly Ile Ile Arg Thr Cys Arg Leu 
20 25 30 

GGC CCT GAC CAA GTC AAA TCC ATG CGG GCT GCT CTT GAT CTT TTC GGT 1151 
Gly Pro Asp Gln Val Lys Ser Met Arg Ala Ala Leu Asp Leu Phe Gly 
35 40 45 

CGT GAG TTC GGA GAC GTA GCC ACC TAC TCC CAA CAT CAG CCG GAC TCC 1199 
Arg Glu Phe Gly Asp Val Ala Thr Tyr Ser Gln His Gln Pro Asp Ser 
50 55 60 

GAT TAC CTC GGG AAC TTG CTC CGT AGT AAG ACA TTC ATC GCG CTT GCT 1247 
Asp Tyr Leu Gly Asn Leu Leu Arg Ser Lys Thr Phe Ile Ala Leu Ala 
65 70 75 80 

GCC TTC GAC CAA GAA GCG GTT GTT GGC GCT CTC GCG GCT TAC GTT CTG 1295 
Ala Phe Asp Gln Glu Ala Val Val Gly Ala Leu Ala Ala Tyr Val Leu 
85 90 95 

CCC AGG TTT GAG CAG CCG CGT AGT GAG ATC TAT ATC TAT GAT CTC GCA 1343 
Pro Arg Phe Glu Gln Pro Arg Ser Glu Ile Tyr Ile Tyr Asp Leu Ala 
100 105 110 

GTC TCC GGC GAG CAC CGG AGG CAG GGC ATT GCC ACC GCG CTC ATC AAT 1391 
Val Ser Gly Glu His Arg Arg Gln Gly Ile Ala Thr Ala Leu Ile Asn 
115 120 125 

CTC CTC AAG CAT GAG GCC AAC GCG CTT GGT GCT TAT GTG ATC TAC GTG 1439 
Leu Leu Lys His Glu Ala Asn Ala Leu Gly Ala Tyr Val Ile Tyr Val 
130 135 140 

CAA GCA GAT TAC GGT GAC GAT CCC GCA GTG GCT CTC TAT ACA AAG TTG 1487 
Gln Ala Asp Tyr Gly Asp Asp Pro Ala Val Ala Leu Tyr Thr Lys Leu 
145 150 155 160 

GGC ATA CGG GAA GAA GTG ATG CAC TTT GAT ATC GAC CCA AGT ACC GCC 1535 
Gly Ile Arg Glu Glu Val Met His Phe Asp Ile Asp Pro Ser Thr Ala 
165 170 175 

ACC TAA CAATTCGTTC AAGCCGAGAT CGGCTTCCCA TTG AGG GCG CAA GAG GAG 1589 
Thr Leu Arg Ala Gln Glu Glu 
177 197 200 

AAA TGG ATT GAC CAA GAG ATC GTG GCT GTT ACG GAT GAA CAG TTC GAT 1637 
Lys Trp Ile Asp Gln Glu Ile Val Ala Val Thr Asp Glu Gln Phe Asp 
205 210 215 

TTA GAG GGC TAC AAC AGT CGA GCA ATT GAA CTG CCT CGG AAG GCA AAA 1685 
Leu Glu Gly Tyr Asn Ser Arg Ala Ile Glu Leu Pro Arg Lys Ala Lys 
220 225 230 

TTG TTG ATC GTG ACA GTC ATC CGC GGC CTA GCA GTC TTT GAA GCC CTT 1733 
Leu Leu Ile Val Thr Val Ile Arg Gly Leu Ala Val Phe Glu Ala Leu 
235 240 245 250 

TCC CGA TTG AAG CCT GTT CAT TCT GGC GGG GTG CAG ACT GCG GGC AAC 1781 
Ser Arg Leu Lys Pro Val His Ser Gly Gly Val Gln Thr Ala Gly Asn 
255 260 265 

AGC TGT GCC GTA GTG GAC GGC GCC GCG GCG GCT TTG GTG GCT CGA GAG 1829 
Ser Cys Ala Val Val Asp Gly Ala Ala Ala Ala Leu Val Ala Arg Glu 
270 275 280 

TCG TCT GCG ACA CAG CCG GTC TTG GCT AGG ATA CTG GCT ACC TCC GTA 1877 
Ser Ser Ala Thr Gln Pro Val Leu Ala Arg Ile Leu Ala Thr Ser Val 
285 290 295 

GTC GGG ATC GAG CCC GAG CAT ATG GGG CTC GGC CCT GCG CCC GCG ATT 1925 
Val Gly Ile Glu Pro Glu His Met Gly Leu Gly Pro Ala Pro Ala Ile 
300 305 310 

CGC CTG CTG CTT GCG CGT AGT GAT CTT AGT TTG AGG GAT ATC GAC CTC 1973 
Arg Leu Leu Leu Ala Arg Ser Asp Leu Ser Leu Arg Asp Ile Asp Leu 
315 320 325 330 

TTT GAG ATA AAC GAG GCG CAG GCC GCC CAA GTT CTA GCG GTA CAG CAT 2021 
Phe Glu Ile Asn Glu Ala Gln Ala Ala Gln Val Leu Ala Val Gln His 
335 340 345 

GAA TTG GGT ATT GAG CAC TCA AAA CTT AAT ATT TGG GGC GGG GCC ATT 2069 
Glu Leu Gly Ile Glu His Ser Lys Leu Asn Ile Trp Gly Gly Ala Ile 
350 355 360 

GCA CTT GGA CAC CCG CTT GCC GCG ACC GGA TTG CGT CTC TGC ATG ACC 2117 
Ala Leu Gly His Pro Leu Ala Ala Thr Gly Leu Arg Leu Cys Met Thr 
365 370 375 

CTC GCT CAC CAA TTG CAA GCT AAT AAC TTT CGA TAT GGA ATT GCC TCG 2165 
Leu Ala His Gln Leu Gln Ala Asn Asn Phe Arg Tyr Gly Ile Ala Ser 
380 385 390 

GCA TGC ATT GGT GGG GGA CAG GGG ATG GCG GTT CTT TTA GAG AAT CCC 2213 
Ala Cys Ile Gly Gly Gly Gln Gly Met Ala Val Leu Leu Glu Asn Pro 
395 400 405 410 

CAC TTC GGT TCG TCC TCT GCA CGA AGT TCG ATG ATT AAC AGA GTT GAC 2261 
His Phe Gly Ser Ser Ser Ala Arg Ser Ser Met Ile Asn Arg Val Asp 
415 420 425 

CAC TAT CCA CTG AGC TAA CGGGCATCTC CTTTGTTGCT TTGAGGTGGC 2309 
His Tyr Pro Leu Ser 
430 431 

GCACGAAGGA GGGCTCGAAA ATCTCTGCTA AAAACAAGAA GAAGGAACAG GGAACATGAT 2369 

TAGTTTCGCT CGTATGGCAG AAAGTTTAGG AGTCCAGGCT AAACTTGCCC TTGCCTTCGC 2429 

ACTCGTATTA TGTGTCGGGC TGATTGTTAC CGGCACGGGT TTCTACAGTG TACATACCTT 2489 

GTCAGGGTTG GTGGGAATTC 2509 
FIG. 2q: 
GAATTCCGCG GTCGGCGAAA GTTGATGCGC TGTATCGTGG TGAAGATCAA TCCATGCTGC 60 

GTGACGAGGC CACACT GTG AGT TGG TCA GGG GGG GCT TAC TCG GCG TTT TCC 112 
Met Ser Trp Ser Gly Gly Ala Tyr Ser Ala Phe Ser 
1 5 10 

GAC ACT GCG TTG GTT GCG GCA GTG CGC ACC CCC TGG ATT GAT TGC GGG 160 
Asp Thr Ala Leu Val Ala Ala Val Arg Thr Pro Trp Ile Asp Cys Gly 
15 20 25 

GGT GCC CTG TCG CTG GTG TCG CCT ATC GAC TTA GGG GTA AAG GTC GCT 208 
Gly Ala Leu Ser Leu Val Ser Pro Ile Asp Leu Gly Val Lys Val Ala 
30 35 40 

CGC GAA GTT CTG ATG CGT GCG TCG CTT GAA CCA CAA ATG GTC GAT AGC 256 
Arg Glu Val Leu Met Arg Ala Ser Leu Glu Pro Gln Met Val Asp Ser 
45 50 55 60 

GTA CTC GCA GGC TCT ATG GCT CAA GCA AGC TTT GAT GCT TAC CTG CTC 304 
Val Leu Ala Gly Ser Met Ala Gln Ala Ser Phe Asp Ala Tyr Leu Leu 
65 70 75 

CCG CGG CAC ATT GGC TTG TAC AGC GGT GTT CCC AAG TCG GTT CCG GCC 352 
Pro Arg His Ile Gly Leu Tyr Ser Gly Val Pro Lys Ser Val Pro Ala 
80 85 90 

TTG GGG GTG CAG CGC ATT TGC GGC ACA GGC TTC GAA CTG CTT CGG CAG 400 
Leu Gly Val Gln Arg Ile Cys Gly Thr Gly Phe Glu Leu Leu Arg Gln 
95 100 105 

GCC GGC GAG CAG ATT TCC CAA GGC GCT GAT CAC GTG CTG TGT GTC GCG 448 
Ala Gly Glu Gln Ile Ser Gln Gly Ala Asp His Val Leu Cys Val Ala 
110 115 120 

GCA GAG TCC ATG TCG CGT AAC CCC ATC GCG TCG TAT ACA CAC CGG GGC 496 
Ala Glu Ser Met Ser Arg Asn Pro Ile Ala Ser Tyr Thr His Arg Gly 
125 130 135 140 

GGG TTC CGC CTC GGT GCG CCC GTT GAG TTC AAG GAT TTT TTG TGG GAG 544 
Gly Phe Arg Leu Gly Ala Pro Val Glu Phe Lys Asp Phe Leu Trp Glu 
145 150 155 

GCA TTG TTT GAT CCT GCT CCA GGA CTC GAC ATG ATC GCT ACC GCA GAA 592 
Ala Leu Phe Asp Pro Ala Pro Gly Leu Asp Met Ile Ala Thr Ala Glu 
160 165 170 

AAC CTG GCG CGC A TTG AGG GCG CAA GAG GAG AAA TGG ATT GAC CAA GAG 641 
Asn Leu Ala Arg Leu Arg Ala Gln Glu Glu Lys Trp Ile Asp Gln Glu 
175 176 197 200 205 

ATC GTG GCT GTT ACG GAT GAA CAG TTC GAT TTA GAG GGC TAC AAC AGT 689 
Ile Val Ala Val Thr Asp Glu Gln Phe Asp Leu Glu Gly Tyr Asn Ser 
210 215 220 

CGA GCA ATT GAA CTG CCT CGG AAG GCA AAA TTG TTG ATC GTG ACA GTC 737 
Arg Ala Ile Glu Leu Pro Arg Lys Ala Lys Leu Leu Ile Val Thr Val 
225 230 235 240 

ATC CGC GGC CTA GCA GTC TTT GAA GCC CTT TCC CGA TTG AAG CCT GTT 785 
Ile Arg Gly Leu Ala Val Phe Glu Ala Leu Ser Arg Leu Lys Pro Val 
245 250 255 

CAT TCT GGC GGG GTG CAG ACT GCG GGC AAC AGC TGT GCC GTA GTG GAC 833 
His Ser Gly Gly Val Gln Thr Ala Gly Asn Ser Cys Ala Val Val Asp 
260 265 270 

GGC GCC GCG GCG GCT TTG GTG GCT CGA GAG TCG TCT GCG ACA CAG CCG 881 
Gly Ala Ala Ala Ala Leu Val Ala Arg Glu Ser Ser Ala Thr Gln Pro 
275 280 285 

GTC TTG GCT AGG ATA CTG GCT ACC TCC GTA GTC GGG ATC GAG CCC GAG 929 
Val Leu Ala Arg Ile Leu Ala Thr Ser Val Val Gly Ile Glu Pro Glu 
290 295 300 

CAT ATG GGG CTC GGC CCT GCG CCC GCG ATT CGC CTG CTG CTT GCG CGT 977 
His Met Gly Leu Gly Pro Ala Pro Ala Ile Arg Leu Leu Leu Ala Arg 
305 310 315 320 

AGT GAT CTT AGT TTG AGG GAT ATC GAC CTC TTT GAG ATA AAC GAG GCG 1025 
Ser Asp Leu Ser Leu Arg Asp Ile Asp Leu Phe Glu Ile Asn Glu Ala 
325 330 335 

CAG GCC GCC CAA GTT CTA GCG GTA CAG CAT GAA TTG GGT ATT GAG CAC 1073 
Gln Ala Ala Gln Val Leu Ala Val Gln His Glu Leu Gly Ile Glu His 
340 345 350 

TCA AAA CTT AAT ATT TGG GGC GGG GCC ATT GCA CTT GGA CAC CCG CTT 1121 
Ser Lys Leu Asn Ile Trp Gly Gly Ala Ile Ala Leu Gly His Pro Leu 
355 360 365 

GCC GCG ACC GGA TTG CGT CTC TGC ATG ACC CTC GCT CAC CAA TTG CAA 1169 
Ala Ala Thr Gly Leu Arg Leu Cys Met Thr Leu Ala His Gln Leu Gln 
370 375 380 

GCT AAT AAC TTT CGA TAT GGA ATT GCC TCG GCA TGC ATT GGT GGG GGA 1217 
Ala Asn Asn Phe Arg Tyr Gly Ile Ala Ser Ala Cys Ile Gly Gly Gly 
385 390 395 400 

CAG GGG ATG GCG GTT CTT TTA GAG AAT CCC CAC TTC GGT TCG TCC TCT 1265 
Gln Gly Met Ala Val Leu Leu Glu Asn Pro His Phe Gly Ser Ser Ser 
405 410 415 

GCA CGA AGT TCG ATG ATT AAC AGA GTT GAC CAC TAT CCA CTG AGC TAA 1313 
Ala Arg Ser Ser Met Ile Asn Arg Val Asp His Tyr Pro Leu Ser 
420 425 430 431 

CGGGCATCTC CTTTGTTGCT TTGAGGTGGC GCACGAAGGA GGGCTCGAAA ATCTCTGCTA 1373 

AAAACAAGAA GAAGGAACAG GGAACATGAT TAGTTTCGCT CGTATGGCAG AAAGTTTAGG 1433 

AGTCCAGGCT AAACTTGCCC TTGCCTTCGC ACTCGTATTA TGTGTCGGGC TGATTGTTAC 1493 

CGGCACGGGT TTCTACAGTG TACATACCTT GTCAGGGTTG GTGGGAATTC 1543 



FIG. 2r: 
Sequence 1 

CTGCAGCCAG GGCTGAAAAG GAGGGATTCA GTGAGGTCAT GAAGGGAGGG GACGGCGCCT 60 

GGCTCCAATT GCTCGATGGC GCCGCGATTG AGTGTCTTGG GCGCGGTCTT GGAGAGTTCG 120 

GCTAGGGAGA TAAATTTGCT GGCCATGGTG GCGGCCCCTG ATGGGTTGGA TGATTTTCTG 180 

CATTCTGCAT CATGAAATTC ATGAAATCAT CACTTTTCGG GGGGTGGGTG CACGGGATTG 240 

AAGGTTGCTA GGAGAGTGCA TTGCTCGTAA GCCCAGGAAG CACGCGGGTT TCAGGATGGT 300 

GCATGGAAAT GGCATGAGCT TTGCTGGATA TGATTAGAGA CATTAACTAT TTTGGCGGAA 360 

TGGAAGCACG ATTCCTCGCC CGGTAGAGCG GTAACCGCGA CATTCAGGAC CGTAAAAAGG 420 

AAAGAGCATG CAACTGACCA ACAAGAAAAT CGTCGTCACC GGAGTGTCCT CCGGTATCGG 480 

TGCCGAAACT GCCCGCGTTC TGCGCTCTCA CGGCGCCACA GTGATTGGCG TAGATCGCAA 540 

CATGCCGAGC CTGACTCTGG ATGCTTTCGT TCAGGCTGAC CTGAGCCATC CTGAAGGCAT 600 

CGATAAGGCC ATCGGGACAG CAAGCGAACC GGAATTGCCA GCTGGGGCGC CCTCTGGTAA 660 

GGTTGGGAAG CCCTGCAAAG TAAACTGGAT GGCTTTCTTG CCGCCAAGGA TCTGATGGCG 720 

CAGGGGATCA AGATCTGATC AAGAGACAGG ATGAGGATCG TTTCGCATGA TTGAACAAGA 780 

TGGATTGCAC GCAGGTTCTC CGGCCGCTTG GGTGGAGAGG CTATTCGGCT ATGACTGGGC 840 

ACAACAGACA ATCGGCTGCT CTGATGCCGC CGTGTTCCGG CTGTCAGCGC AGGGGCGCCC 900 

GGTTCTTTTT GTCAAGACCG ACCTGTCCGG TGCCCTGAAT GAACTGCAGG ACGAGGCAGC 960 

GCGGCTATCG TGGCTGGCCA CGACGGGCGT TCCTTGCGCA GCTGTGCTCG ACGTTGTCAC 1020 

TGAAGCGGGA AGGGACTGGC TGCTATTGGG CGAAGTGCCG GGGCAGGATC TCCTGTCATC 1080 

TCACCTTGCT CCTGCCGAGA AAGTATCCAT CATGGCTGAT GCAATGCGGC GGCTGCATAC 1140 

GCTTGATCCG GCTACCTGCC CATTCGACCA CCAAGCGAAA CATCGCATCG AGCGAGCACG 1200 

TACTCGGATG GAAGCCGGTC TTGTCGATCA GGATGATCTG GACGAAGAGC ATCAGGGGCT 1260 

CGCGCCAGCC GAACTGTTCG CCAGGCTCAA GGCGCGCATG CCCGACGGCG AGGATCTCGT 1320 

CGTGACCCAT GGCGATGCCT GCTTGCCGAA TATCATGGTG GAAAATGGCC GCTTTTCTGG 1380 

ATTCATCGAC TGTGGCCGGC TGGGTGTGGC GGACCGCTAT CAGGACATAG CGTTGGCTAC 1440 

CCGTGATATT GCTGAAGAGC TTGGCGGCGA ATGGGCTGAC CGCTTCCTCG TGCTTTACGG 1500 

TATCGCCGCT CCCGATTCGC AGCGCATCGC CTTCTATCGC CTTCTTGACG AGTTCTTCTG 1560 

AGCGGGACTC TGGGGTTCGA AATGACCGAC CAAGCGACGC CCTGGCCGCG GTGATTGCAT 1620 

TCATGTGTGC TGAGGAGTCA CGTTGGATCA ACGGCATAAA TATTCCAGTG GACGGAGGTT 1680 

TGGCATCGAC CTACGTGTAA GTTCGTGGAC GCCCTTTGCA CGCGCACTAT ATCTCTATGC 1740 

AGCAGCTGAA AGCAGCTTTG GTTTTGATCG GAGGTAGCGG GCGGAAAGGT GCAGAATGTC 1800 

TAAATAATAA AGGATTCTTG TGAAGCTTTA GTTGTCCGTA AACGAAAATA AAAATAAAGA 1860 

GGAATGATAT GAAAGCAAGT AGATCAGTCT GCACTTTCAA AATAGCTACC CTGGCAGGCG 1920 

CCATTTATGC AGCGCTGCCA ATGTCAGCTG CAAACTCGAT GCAGCTGGAT GTAGGTAGCT 1980 

CGGATTGGAC GGTGCGTTGG GGACAACACC CTCAAGTATA GCCTTGCCTC TCGCCTGAAT 2040 

GAGCAAGACT CAAGTCTGAC AAATGCGCCG ACTGTCAATG GTTATATCCG GATATTCAAA 2100 

GTCAGGGTGA TCGTAACTTT GACCGGGGGC TTGGTATCCA ATCGTCTCGA TATTCTGGCT 2160 

GCAG 2164 




Sequence 2 

CTGCAGCCAG GGCTGAAAAG GAGGGATTCA GTGAGGTCAT GAAGGGAGGG GACGGCGCCT 60 

GGCTCCAATT GCTCGATGGC GCCGCGATTG AGTGTCTTGG GCGCGGTCTT GGAGAGTTCG 120 

GCTAGGGAGA TAAATTTGCT GGCCATGGTG GCGGCCCCTG ATGGGTTGGA TGATTTTCTG 180 

CATTCTGCAT CATGAAATTC ATGAAATCAT CACTTTTCGG GGGGTGGGTG CACGGGATTG 240 

AAGGTTGCTA GGAGAGTGCA TTGCTCGTAA GCCCAGGAAG CACGCGGGTT TCAGGATGGT 300 

GCATGGAAAT GGCATGAGCT TTGCTGGATA TGATTAGAGA CATTAACTAT TTTGGCGGAA 360 

TGGAAGCACG ATTCCTCGCC CGGTAGAGCG GTAACCGCGA CATTCAGGAC CGTAAAAAGG 420 

AAAGAGCATG CAACTGACCA ACAAGAAAAT CGTCGTCACC GGAGTGTCCT CCGGTATCGG 480 

TGCCGAAACT GCCCGCGTTC TGCGCTCTCA CGGCGCCACA GTGATTGGCG TAGATCGCAA 540 

CATGCCGAGC CTGACTCTGG ATGCTTTCGT TCAGGCTGAC CTGAGCCATC CTGAGGGGAG 600 

AGGCGGTTTG CGTATTGGGC GCATGCATAA AAACTGTTGT AATTCATTAA GCATTCTGCC 660 

GACATGGAAG CCATCACAAA CGGCATGATG AACCTGAATC GCCAGCGGCA TCAGCACCTT 720 

GTCGCCTTGC GTATAATATT TGCCCATGGA CGCACACCGT GGAAACGGAT GAAGGCACGA 780 

ACCCAGTTGA CATAAGCCTG TTCGGTTCGT AAACTGTAAT GCAAGTAGCG TATGCGCTCA 840 

CGCAACTGGT CCAGAACCTT GACCGAACGC AGCGGTGGTA ACGGCGCAGT GGCGGTTTTC 900 

ATGGCTTGTT ATGACTGTTT TTTTGTACAG TCTATGCCTC GGGCATCCAA GCAGCAAGCG 960 

CGTTACGCCG TGGGTCGATG TTTGATGTTA TGGAGCAGCA ACGATGTTAC GCAGCAGCAA 1020 

CGATGTTACG CAGCAGGGCA GTCGCCCTAA AACAAAGTTA GGTGGCTCAA GTATGGGCAT 1080 

CATTCGCACA TGTAGGCTCG GCCCTGACCA AGTCAAATCC ATGCGGGCTG CTCTTGATCT 1140 

TTTCGGTCGT GAGTTCGGAG ACGTAGCCAC CTACTCCCAA CATCAGCCGG ACTCCGATTA 1200 

CCTCGGGAAC TTGCTCCGTA GTAAGACATT CATCGCGCTT GCTGCCTTCG ACCAAGAAGC 1260 

GGTTGTTGGC GCTCTCGCGG CTTACGTTCT GCCCAGGTTT GAGCAGCCGC GTAGTGAGAT 1320 

CTATATCTAT GATCTCGCAG TCTCCGGCGA GCACCGGAGG CAGGGCATTG CCACCGCGCT 1380 

CATCAATCTC CTCAAGCATG AGGCCAACGC GCTTGGTGCT TATGTGATCT ACGTGCAAGC 1440 

AGATTACGGT GACGATCCCG CAGTGGCTCT CTATACAAAG TTGGGCATAC GGGAAGAAGT 1500 

GATGCACTTT GATATCGACC CAAGTACCGC CACCTAACAA TTCGTTCAAG CCGAGATCGG 1560 

CTTCCCTGAT TGCATTCATG TGTGCTGAGG AGTCACGTTG GATCAACGGC ATAAATATTC 1620 

CAGTGGACGG AGGTTTGGCA TCGACCTACG TGTAAGTTCG TGGACGCCCT TTGCACGCGC 1680 

ACTATATCTC TATGCAGCAG CTGAAAGCAG CTTTGGTTTT GATCGGAGGT AGCGGGCGGA 1740 

AAGGTGCAGA ATGTCTAAAT AATAAAGGAT TCTTGTGAAG CTTTAGTTGT CCGTAAACGA 1800 

AAATAAAAAT AAAGAGGAAT GATATGAAAG CAAGTAGATC AGTCTGCACT TTCAAAATAG 1860 

CTACCCTGGC AGGCGCCATT TATGCAGCGC TGCCAATGTC AGCTGCAAAC TCGATGCAGC 1920 

TGGATGTAGG TAGCTCGGAT TGGACGGTGC GTTGGGGACA ACACCCTCAA GTATAGCCTT 1980 

GCCTCTCGCC TGAATGAGCA AGACTCAAGT CTGACAAATG CGCCGACTGT CAATGGTTAT 2040 

ATCCGGATAT TCAAAGTCAG GGTGATCGTA ACTTTGACCG GGGGCTTGGT ATCCAATCGT 2100 

CTCGATATTC TGGCTGCAG 2119 

Sequence 3 

CTGCAGCCAG GGCTGAAAAG GAGGGATTCA GTGAGGTCAT GAAGGGAGGG GACGGCGCCT 60 

GGCTCCAATT GCTCGATGGC GCCGCGATTG AGTGTCTTGG GCGCGGTCTT GGAGAGTTCG 120 

GCTAGGGAGA TAAATTTGCT GGCCATGGTG GCGGCCCCTG ATGGGTTGGA TGATTTTCTG 180 

CATTCTGCAT CATGAAATTC ATGAAATCAT CACTTTTCGG GGGGTGGGTG CACGGGATTG 240 

AAGGTTGCTA GGAGAGTGCA TTGCTCGTAA GCCCAGGAAG CACGCGGGTT TCAGGATGGT 300 

GCATGGAAAT GGCATGAGCT TTGCTGGATA TGATTAGAGA CATTAACTAT TTTGGCGGAA 360 

TGGAAGCACG ATTCCTCGCC CGGTAGAGCG GTAACCGCGA CATTCAGGAC CGTAAAAAGG 420 

AAAGAGCATG CAACTGACCA ACAAGAAAAT CGTCGTCACC GGAGTGTCCT CCGGTATCGG 480 

TGCCGAAACT GCCCGCGTTC TGCGCTCTCA CGGCGCCACA GTGATTGGCG TAGATCGCAA 540 

CATGCCGAGC CTGACTCTGG ATGCTTTCGT TCAGGCTGAC CTGAGCCATC CTGAAGGCAT 600 

CGATCAACGG CATAAATATT CCAGTGGACG GAGGTTTGGC ATCGACCTAC GTGTAAGTTC 660 

GTGGACGCCC TTTGCACGCG CACTATATCT CTATGCAGCA GCTGAAAGCA GCTTTGGTTT 720 

TGATCGGAGG TAGCGGGCGG AAAGGTGCAG AATGTCTAAA TAATAAAGGA TTCTTGTGAA 780 

GCTTTAGTTG TCCGTAAACG AAAATAAAAA TAAAGAGGAA TGATATGAAA GCAAGTAGAT 840 

CAGTCTGCAC TTTCAAAATA GCTACCCTGG CAGGCGCCAT TTATGCAGCG CTGCCAATGT 900 

CAGCTGCAAA CTCGATGCAG CTGGATGTAG GTAGCTCGGA TTGGACGGTG CGTTGGGGAC 960 

AACACCCTCA AGTATAGCCT TGCCTCTCGC CTGAATGAGC AAGACTCAAG TCTGACAAAT 1020 

GCGCCGACTG TCAATGGTTA TATCCGGATA TTCAAAGTCA GGGTGATCGT AACTTTGACC 1080 

GGGGGCTTGG TATCCAATCG TCTCGATATT CTGGCTGCAG 1120 

Sequence 4 

GAATTCCGCG TATCGCCCGG TTCTATCAGC GGGCCGCTTT CGAAAGTCAT GGTGTTAGCC 60 

GGTAGGGTCT TTTTCTTGGC CATGCTTGTT GCCTGAACCT TCGTTGACAT AGGGCAGAGG 120 

TGCGTTTGCC GCTTCGCTTC GCGATGAACC GCATCGAGAT GCTGAGGTCA GGATTTTTCC 180 

TTAACTCGCG TAAGCATTCT GTCATTTTTT TGGTGGCTTT GAACAGCCTG ATGAAAGGTG 240 

GTCTCGCCCT TTGAGGCCGA TTCTTGGGCG CTTGGCGGCG TCGAAGCGAT GCTCCACTAC 300 

CGATTAAGAT AATTAAAATA AGGAAACCGC ATGGTTTCTT ATGTGAATTT GTCTGGCATA 360 

CTCCAGCTCA AGGGCAATTT TTGGGCTATT GGCTGAGCAG TTGCCTCTAT ATGGTTATTC 420 

AGAATAACAA TTGACTCCTC AGGAGGTCAG CGATGAGCAT TCTTGGTTTG AATGGTGCCC 480 

CGGTCGGAGC TGAGCAGCTG GGCTCGGCTC TTGATCGCAT GAAGAAGGCG CACCTGGAGC 540 

AGGGGCCTGC AAACTTGGAG CTGCGTCTGA GTAGGCTGGA TCGTGCGATT GCAATGCTTC 600 

TGGAAAATCG TGAAGCAATT GCCGACGCGG TTTCTGCTGA CTTTGGCAAT CGCAGCCGTG 660 

AGCAAACACT GCTTTGCGAC ATTGCTGGCT CGGTGGCAAG CCTGAAGGAT AGCCGCGAGC 720 

ACGTGGCCAA ATGGATGGAG CCCGAACATC ACAAGGCGAT GTTTCCAGGG GCGGAGGCAC 780 

GCGTTGAGTT TCAGCCGCTG GGTGTCGTTG GGGTCATTAG TCCCTGGAAC TTCCCTATCG 840 

TACTGGCCTT TGGGCCGCTG GCCGGCATAT TCGCAGCAGG TAATCGCGCC ATGCTCAAGC 900 

CGTCCGAGCT TACCCCGCGG ACTTCTGCCC TGCTTGCGGA GCTAATTGCT CGTTACTTCG 960 

ATGAAACTGA GCTGACTACA GTGCTGGGCG ACGCTGAAGT CGGTGCGCTG TTCAGTGCTC 1020 

AGCCTTTCGA TCATCTGATC TTCACCGGCG GCACTGCCGT GGCCAAGCAC ATCATGCGTG 1080 

CCGCGGCGGA TAACCTAGTG CCCGTTACCC TGGAATTGGG TGGCAAATCG CCGGTGATCG 1140 

TTTCCCGCAG TGCAGATATG GCGGACGTTG CACAACGGGT GTTGACGGTG AAAACCTTCA 1200 

ATGCCGGGCA AATCTGTCTG GCACCGGACT ATGTGCTGCT GCCGGAAGGG ACAGCAAGCG 1260 

AACCGGAATT GCCAGCTGGG GCGCCCTCTG GTAAGGTTGG GAAGCCCTGC AAAGTAAACT 1320 

GGATGGCTTT CTTGCCGCCA AGGATCTGAT GGCGCAGGGG ATCAAGATCT GATCAAGAGA 1380 

CAGGATGAGG ATCGTTTCGC ATGATTGAAC AAGATGGATT GCACGCAGGT TCTCCGGCCG 1440 

CTTGGGTGGA GAGGCTATTC GGCTATGACT GGGCACAACA GACAATCGGC TGCTCTGATG 1500 

CCGCCGTGTT CCGGCTGTCA GCGCAGGGGC GCCCGGTTCT TTTTGTCAAG ACCGACCTGT 1560 

CCGGTGCCCT GAATGAACTG CAGGACGAGG CAGCGCGGCT ATCGTGGCTG GCCACGACGG 1620 

GCGTTCCTTG CGCAGCTGTG CTCGACGTTG TCACTGAAGC GGGAAGGGAC TGGCTGCTAT 1680 

TGGGCGAAGT GCCGGGGCAG GATCTCCTGT CATCTCACCT TGCTCCTGCC GAGAAAGTAT 1740 

CCATCATGGC TGATGCAATG CGGCGGCTGC ATACGCTTGA TCCGGCTACC TGCCCATTCG 1800 

ACCACCAAGC GAAACATCGC ATCGAGCGAG CACGTACTCG GATGGAAGCC GGTCTTGTCG 1860 

ATCAGGATGA TCTGGACGAA GAGCATCAGG GGCTCGCGCC AGCCGAACTG TTCGCCAGGC 1920 

TCAAGGCGCG CATGCCCGAC GGCGAGGATC TCGTCGTGAC CCATGGCGAT GCCTGCTTGC 1980 

CGAATATCAT GGTGGAAAAT GGCCGCTTTT CTGGATTCAT CGACTGTGGC CGGCTGGGTG 2040 

TGGCGGACCG CTATCAGGAC ATAGCGTTGG CTACCCGTGA TATTGCTGAA GAGCTTGGCG 2100 

GCGAATGGGC TGACCGCTTC CTCGTGCTTT ACGGTATCGC CGCTCCCGAT TCGCAGCGCA 2160 

TCGCCTTCTA TCGCCTTCTT GACGAGTTCT TCTGAGCGGG ACTCTGGGGT TCGAAATGAC 2220 

CGACCAAGCG ACGCCCGCCA TGCCAAGCCT GTTCTCGTGC AAAGTCCTGT GGGTGAGTCG 2280 

AACTTGGCGA TGCGCGCACC CTACGGAGAA GCGATCCACG GACTGCTCTC TGTCCTCCTT 2340 

TCAACGGAGT GTTAGAACCG TTGGTAGTGG TTTTGGACGG GCCCAGGAGC ATGCGCTTCT 2400 

GGGCCCGTTT CTTGAGTATT CATTGGATAG TCACGCGTGG TAGCTTCGAG CCTGCACAGC 2460 

TGATGAGCAC CCTGGAAGGC GCGCTGTACG CGGACGACTG GGTTCATCTT CGCCATTCAT 2520 

GACGGAACTC CGTTCCCCAG TACCGCGATG ACTATTTTGC CTCTTCCGAT GTCCGATTCC 2580 

ACGCCGCCTG ACGCTAAGCG GGGGCGGGGG CGCCCGCATC CCAGCCCAGA CAGCAACAAA 2640 

TGAGTAGGCT CTTGGATGCC GCGGCGGCTG AGATTGGTAA CGGCAATTTC GTCAATGTGA 2700 

CGATGGATTC GATTGCCCGT GCTGCCGGCG TCTCAAAAAA AACGCTGTAC GTCTTGGTGG 2760 

CGAGCAAGGA AGAACTCATT TCCCGGTTAG TGGCTCGAGA CATGTCCAAC CTTGAGGAAT 2820 

TC 2822 

Sequence 5 

GAATTCCGCG TATCGCCCGG TTCTATCAGC GGGCCGCTTT CGAAAGTCAT GGTGTTAGCC 60 

GGTAGGGTCT TTTTCTTGGC CATGCTTGTT GCCTGAACCT TCGTTGACAT AGGGCAGAGG 120 

TGCGTTTGCC GCTTCGCTTC GCGATGAACC GCATCGAGAT GCTGAGGTCA GGATTTTTCC 180 

TTAACTCGCG TAAGCATTCT GTCATTTTTT TGGTGGCTTT GAACAGCCTG ATGAAAGGTG 240 

GTCTCGCCCT TTGAGGCCGA TTCTTGGGCG CTTGGCGGCG TCGAAGCGAT GCTCCACTAC 300 

CGATTAAGAT AATTAAAATA AGGAAACCGC ATGGTTTCTT ATGTGAATTT GTCTGGCATA 360 

CTCCAGCTCA AGGGCAATTT TTGGGCTATT GGCTGAGCAG TTGCCTCTAT ATGGTTATTC 420 

AGAATAACAA TTGACTCCTC AGGAGGTCAG CGATGAGCAT TCTTGGTTTG AATGGTGCCC 480 

CGGTCGGAGC TGAGCAGCTG GGCTCGGCTC TTGATCGCAT GAAGAAGGCG CACCTGGAGC 540 

AGGGGCCTGC AAACTTGGAG CTGCGTCTGA GTAGGCTGGA TCGTGCGATT GCAATGCTTC 600 

TGGAAAATCG TGAAGCAATT GCCGACGCGG TTTCTGCTGA CTTTGGCAAT CGCAGCCGTG 660 

AGCAAACACT GCTTTGCGAC ATTGCTGGCT CGGTGGCAAG CCTGAAGGAT AGCCGCGAGC 720 

ACGTGGCCAA ATGGATGGAG CCCGAACATC ACAAGGCGAT GTTTCCAGGG GCGGAGGCAC 780 

GCGTTGAGTT TCAGCCGCTG GGTGTCGTTG GGGTCATTAG TCCCTGGAAC TTCCCTATCG 840 

TACTGGCCTT TGGGCCGCTG GCCGGCATAT TCGCAGCAGG TAATCGCGCC ATGCTCAAGC 900 

CGTCCGAGCT TACCCCGCGG ACTTCTGCCC TGCTTGCGGA GCTAATTGCT CGTTACTTCG 960 

ATGAAACTGA GCTGACTACA GTGCTGGGCG ACGCTGAAGT CGGTGCGCTG TTCAGTGCTC 1020 

AGCCTTTCGA TCATCTGATC TTCACCGGCG GCACTGCCGT GGCCAAGCAC ATCATGCGTG 1080 

CCGCGGCGGA TAACCTAGTG CCCGTTACCC TGGAATTGGG TGGCAAATCG CCGGTGATCG 1140 

TTTCCCGCAG TGCAGATATG GCGGACGTTG CACAACGGGT GTTGACGGTG AAAACCTTCA 1200 

ATGCCGGGCA AATCTGTCTG GCACCGGACT ATGTGCTGGG GGAGAGGCGG TTTGCGTATT 1260 

GGGCGCATGC ATAAAAACTG TTGTAATTCA TTAAGCATTC TGCCGACATG GAAGCCATCA 1320 

CAAACGGCAT GATGAACCTG AATCGCCAGC GGCATCAGCA CCTTGTCGCC TTGCGTATAA 1380 

TATTTGCCCA TGGACGCACA CCGTGGAAAC GGATGAAGGC ACGAACCCAG TTGACATAAG 1440 

CCTGTTCGGT TCGTAAACTG TAATGCAAGT AGCGTATGCG CTCACGCAAC TGGTCCAGAA 1500 

CCTTGACCGA ACGCAGCGGT GGTAACGGCG CAGTGGCGGT TTTCATGGCT TGTTATGACT 1560 

GTTTTTTTGT ACAGTCTATG CCTCGGGCAT CCAAGCAGCA AGCGCGTTAC GCCGTGGGTC 1620 

GATGTTTGAT GTTATGGAGC AGCAACGATG TTACGCAGCA GCAACGATGT TACGCAGCAG 1680 

GGCAGTCGCC CTAAAACAAA GTTAGGTGGC TCAAGTATGG GCATCATTCG CACATGTAGG 1740 

CTCGGCCCTG ACCAAGTCAA ATCCATGCGG GCTGCTCTTG ATCTTTTCGG TCGTGAGTTC 1800 

GGAGACGTAG CCACCTACTC CCAACATCAG CCGGACTCCG ATTACCTCGG GAACTTGCTC 1860 

CGTAGTAAGA CATTCATCGC GCTTGCTGCC TTCGACCAAG AAGCGGTTGT TGGCGCTCTC 1920 

GCGGCTTACG TTCTGCCCAG GTTTGAGCAG CCGCGTAGTG AGATCTATAT CTATGATCTC 1980 

GCAGTCTCCG GCGAGCACCG GAGGCAGGGC ATTGCCACCG CGCTCATCAA TCTCCTCAAG 2040 

CATGAGGCCA ACGCGCTTGG TGCTTATGTG ATCTACGTGC AAGCAGATTA CGGTGACGAT 2100 

CCCGCAGTGG CTCTCTATAC AAAGTTGGGC ATACGGGAAG AAGTGATGCA CTTTGATATC 2160 

GACCCAAGTA CCGCCACCTA ACAATTCGTT CAAGCCGAGA TCGGCTTCCC TGCAAAGTCC 2220 

TGTGGGTGAG TCGAACTTGG CGATGCGCGC ACCCTACGGA GAAGCGATCC ACGGACTGCT 2280 

CTCTGTCCTC CTTTCAACGG AGTGTTAGAA CCGTTGGTAG TGGTTTTGGA CGGGCCCAGG 2340 

AGCATGCGCT TCTGGGCCCG TTTCTTGAGT ATTCATTGGA TAGTCACGCG TGGTAGCTTC 2400 

GAGCCTGCAC AGCTGATGAG CACCCTGGAA GGCGCGCTGT ACGCGGACGA CTGGGTTCAT 2460 

CTTCGCCATT CATGACGGAA CTCCGTTCCC CAGTACCGCG ATGACTATTT TGCCTCTTCC 2520 

GATGTCCGAT TCCACGCCGC CTGACGCTAA GCGGGGGCGG GGGCGCCCGC ATCCCAGCCC 2580 

AGACAGCAAC AAATGAGTAG GCTCTTGGAT GCCGCGGCGG CTGAGATTGG TAACGGCAAT 2640 

TTCGTCAATG TGACGATGGA TTCGATTGCC CGTGCTGCCG GCGTCTCAAA AAAAACGCTG 2700 

TACGTCTTGG TGGCGAGCAA GGAAGAACTC ATTTCCCGGT TAGTGGCTCG AGACATGTCC 2760 

AACCTTGAGG AATTC 2775 

Sequence 6 

GAATTCCGCG TATCGCCCGG TTCTATCAGC GGGCCGCTTT CGAAAGTCAT GGTGTTAGCC 60 

GGTAGGGTCT TTTTCTTGGC CATGCTTGTT GCCTGAACCT TCGTTGACAT AGGGCAGAGG 120 

TGCGTTTGCC GCTTCGCTTC GCGATGAACC GCATCGAGAT GCTGAGGTCA GGATTTTTCC 180 

TTAACTCGCG TAAGCATTCT GTCATTTTTT TGGTGGCTTT GAACAGCCTG ATGAAAGGTG 240 

GTCTCGCCCT TTGAGGCCGA TTCTTGGGCG CTTGGCGGCG TCGAAGCGAT GCTCCACTAC 300 

CGATTAAGAT AATTAAAATA AGGAAACCGC ATGGTTTCTT ATGTGAATTT GTCTGGCATA 360 

CTCCAGCTCA AGGGCAATTT TTGGGCTATT GGCTGAGCAG TTGCCTCTAT ATGGTTATTC 420 

AGAATAACAA TTGACTCCTC AGGAGGTCAG CGATGAGCAT TCTTGGTTTG AATGGTGCCC 480 

CGGTCGGAGC TGAGCAGCTG GGCTCGGCTC TTGATCGCAT GAAGAAGGCG CACCTGGAGC 540 

AGGGGCCTGC AAACTTGGAG CTGCGTCTGA GTAGGCTGGA TCGTGCGATT GCAATGCTTC 600 

TGGAAAATCG TGAAGCAATT GCCGACGCGG TTTCTGCTGA CTTTGGCAAT CGCAGCCGTG 660 

AGCAAACACT GCTTTGCGAC ATTGCTGGCT CGGTGGCAAG CCTGAAGGAT AGCCGCGAGC 720 

ACGTGGCCAA ATGGATGGAG CCCGAACATC ACAAGGCGAT GTTTCCAGGG GCGGAGGCAC 780 

GCGTTGAGTT TCAGCCGCTG GGTGTCGTTG GGGTCATTAG TCCCTGGAAC TTCCCTATCG 840 

TACTGGCCTT TGGGCCGCTG GCCGGCATAT TCGCAGCAGG TAATCGCGCC ATGCTCAAGC 900 

CGTCCGAGCT TACCCCGCGG ACTTCTGCCC TGCTTGCGGA GCTAATTGCT CGTTACTTCG 960 

ATGAAACTGA GCTGACTACA GTGCTGGGCG ACGCTGAAGT CGGTGCGCTG TTCAGTGCTC 1020 

AGCCTTTCGA TCATCTGATC TTCACCGGCG GCACTGCCGT GGCCAAGCAC ATCATGCGTG 1080 

CCGCGGCGGA TAACCTAGTG CCCGTTACCC TGGAATTGGG TGGCAAATCG CCGGTGATCG 1140 

TTTCCCGCAG TGCAGATATG GCGGACGTTG CACAACGGGT GTTGACGGTG AAAACCTTCA 1200 

ATGCCGGGCA AATCTGTCTG GCACCGTGGG TGAGTCGAAC TTGGCGATGC GCGCACCCTA 1260 

CGGAGAAGCG ATCCACGGAC TGCTCTCTGT CCTCCTTTCA ACGGAGTGTT AGAACCGTTG 1320 

GTAGTGGTTT TGGACGGGCC CAGGAGCATG CGCTTCTGGG CCCGTTTCTT GAGTATTCAT 1380 

TGGATAGTCA CGCGTGGTAG CTTCGAGCCT GCACAGCTGA TGAGCACCCT GGAAGGCGCG 1440 

CTGTACGCGG ACGACTGGGT TCATCTTCGC CATTCATGAC GGAACTCCGT TCCCCAGTAC 1500 

CGCGATGACT ATTTTGCCTC TTCCGATGTC CGATTCCACG CCGCCTGACG CTAAGCGGGG 1560 

GCGGGGGCGC CCGCATCCCA GCCCAGACAG CAACAAATGA GTAGGCTCTT GGATGCCGCG 1620 

GCGGCTGAGA TTGGTAACGG CAATTTCGTC AATGTGACGA TGGATTCGAT TGCCCGTGCT 1680 

GCCGGCGTCT CAAAAAAAAC GCTGTACGTC TTGGTGGCGA GCAAGGAAGA ACTCATTTCC 1740 

CGGTTAGTGG CTCGAGACAT GTCCAACCTT GAGGAATTC 1779 

Sequence 7 

CTGCAGCCGA GCATCGATTG AGCACTTTAC CCAGCTGCGC TGGCTGACCA TTCAGAATGG 60 

CCCGCGGCAC TATCCAATCT AAATCGATCT TCGGGCGCCG CGGGCATCAT GCCCGCGGCG 120 

CTCGCCTCAT TTCAATCTCT AACTTGATAA AAACAGAGCT GTTCTCCGGT CTTGGTGGAT 180 

CAAGGCCAGT CGCGGAGAGT CTCGAAGAGG AGAGTACAGT GAACGCCGAG TCCACATTGC 240 

AACCGCAGGC ATCATCATGC TCTGCTCAGC CACGCTACCG CAGTGTGTCG ATTGGTCATC 300 

CTCCGGTTGA GGTTACGCAA GACGCTGGAG GTATTGTCCG GATGCGTTCT CTCGAGGCGC 360 

TTCTTCCCTT CCCGGGTCGA ATTCTTGAGC GTCTCGAGCA TTGGGCTAAG ACCCGTCCAG 420 

AACAAACCTG CGTTGCTGCC AGGGCGGCAA ATGGGGAATG GCGTCGTATC AGCTACGCGG 480 

AAATGTTCCA CAACGTCCGC GCCATCGCAC AGAGCTTGCT TCCTTACGGA CTATCGGCAG 540 

AGCGTCCGCT GCTTATCGTC TCTGGAAATG ACCTGGAACA TCTTCAGCTG GCATTTGGGG 600 

CTATGTATGC GGGCATTCCC TATTGCCCGG TGTCTCCTGC TTATTCACTG CTGTCGCAAG 660 

ATTTGGCGAA GCTGCGTCAC ATCGTAGGTC TTCTGCAACC GGGACTGGTC TTTGCTGCCG 720 

ATGCAGCACC TTTCCAGGGG ACAGCAAGCG AACCGGAATT GCCAGCTGGG GCGCCCTCTG 780 

GTAAGGTTGG GAAGCCCTGC AAAGTAAACT GGATGGCTTT CTTGCCGCCA AGGATCTGAT 840 

GGCGCAGGGG ATCAAGATCT GATCAAGAGA CAGGATGAGG ATCGTTTCGC ATGATTGAAC 900 

AAGATGGATT GCACGCAGGT TCTCCGGCCG CTTGGGTGGA GAGGCTATTC GGCTATGACT 960 

GGGCACAACA GACAATCGGC TGCTCTGATG CCGCCGTGTT CCGGCTGTCA GCGCAGGGGC 1020 

GCCCGGTTCT TTTTGTCAAG ACCGACCTGT CCGGTGCCCT GAATGAACTG CAGGACGAGG 1080 

CAGCGCGGCT ATCGTGGCTG GCCACGACGG GCGTTCCTTG CGCAGCTGTG CTCGACGTTG 1140 

TCACTGAAGC GGGAAGGGAC TGGCTGCTAT TGGGCGAAGT GCCGGGGCAG GATCTCCTGT 1200 

CATCTCACCT TGCTCCTGCC GAGAAAGTAT CCATCATGGC TGATGCAATG CGGCGGCTGC 1260 

ATACGCTTGA TCCGGCTACC TGCCCATTCG ACCACCAAGC GAAACATCGC ATCGAGCGAG 1320 

CACGTACTCG GATGGAAGCC GGTCTTGTCG ATCAGGATGA TCTGGACGAA GAGCATCAGG 1380 

GGCTCGCGCC AGCCGAACTG TTCGCCAGGC TCAAGGCGCG CATGCCCGAC GGCGAGGATC 1440 

TCGTCGTGAC CCATGGCGAT GCCTGCTTGC CGAATATCAT GGTGGAAAAT GGCCGCTTTT 1500 

CTGGATTCAT CGACTGTGGC CGGCTGGGTG TGGCGGACCG CTATCAGGAC ATAGCGTTGG 1560 

CTACCCGTGA TATTGCTGAA GAGCTTGGCG GCGAATGGGC TGACCGCTTC CTCGTGCTTT 1620 

ACGGTATCGC CGCTCCCGAT TCGCAGCGCA TCGCCTTCTA TCGCCTTCTT GACGAGTTCT 1680 

TCTGAGCGGG ACTCTGGGGT TCGAAATGAC CGACCAAGCG ACGCCCCTGT TTTGCAATGG 1740 

CGGTCGGCGA AAGTTGATGC GCTGTATCGT GGTGAAGATC AATCCATGCT GCGTGACGAG 1800 

GCCACACTGT GAGTTGGTCA GGGGGGGCTT ACTCGGCGTT TTCCGACACT GCGTTGGTTG 1860 

CGGCAGTGCG CACCCCCTGG ATTGATTGCG GGGGTGCCCT GTCGCTGGTG TCGCCTATCG 1920 

ACTTAGGGGT AAAGGTCGCT CGCGAAGTTC TGATGCGTGC GTCGCTTGAA CCACAAATGG 1980 

TCGATAGCGT ACTCGCAGGC TCTATGGCTC AAGCAAGCTT TGATGCTTAC CTGCTCCCGC 2040 

GGCACATTGG CTTGTACAGC GGTGTTCCCA AGTCGGTTCC GGCCTTGGGG GTGCAGCGCA 2100 

TTTGCGGCAC AGGCTTCGAA CTGCTTCGGC AGGCCGGCGA GCAGATTTCC CAAGGCGCTG 2160 

ATCACGTGCT GTGTGTCGCG GGCTGCAG 2188 

Sequence 8 

CTGCAGCCGA GCATCGATTG AGCACTTTAC CCAGCTGCGC TGGCTGACCA TTCAGAATGG 60 

CCCGCGGCAC TATCCAATCT AAATCGATCT TCGGGCGCCG CGGGCATCAT GCCCGCGGCG 120 

CTCGCCTCAT TTCAATCTCT AACTTGATAA AAACAGAGCT GTTCTCCGGT CTTGGTGGAT 180 

CAAGGCCAGT CGCGGAGAGT CTCGAAGAGG AGAGTACAGT GAACGCCGAG TCCACATTGC 240 

AACCGCAGGC ATCATCATGC TCTGCTCAGC CACGCTACCG CAGTGTGTCG ATTGGTCATC 300 

CTCCGGTTGA GGTTACGCAA GACGCTGGAG GTATTGTCCG GATGCGTTCT CTCGAGGCGC 360 

TTCTTCCCTT CCCGGGTCGA ATTCTTGAGC GTCTCGAGCA TTGGGCTAAG ACCCGTCCAG 420 

AACAAACCTG CGTTGCTGCC AGGGCGGCAA ATGGGGAATG GCGTCGTATC AGCTACGCGG 480 

AAATGTTCCA CAACGTCCGC GCCATCGCAC AGAGCTTGCT TCCTTACGGA CTATCGGCAG 540 

AGCGTCCGCT GCTTATCGTC TCTGGAAATG ACCTGGAACA TCTTCAGCTG GCATTTGGGG 600 

CTATGTATGC GGGCATTCCC TATTGCCCGG TGTCTCCTGC TTATTCACTG CTGTCGCAAG 660 

ATTTGGCGAA GCTGCGTCAC ATCGTAGGTC TTCTGCAACC GGGACTGGTC TTTGCTGCCG 720 

ATGCAGCACC TTTCCAGGGG GAGAGGCGGT TTGCGTATTG GGCGCATGCA TAAAAACTGT 780 

TGTAATTCAT TAAGCATTCT GCCGACATGG AAGCCATCAC AAACGGCATG ATGAACCTGA 840 

ATCGCCAGCG GCATCAGCAC CTTGTCGCCT TGCGTATAAT ATTTGCCCAT GGACGCACAC 900 

CGTGGAAACG GATGAAGGCA CGAACCCAGT TGACATAAGC CTGTTCGGTT CGTAAACTGT 960 

AATGCAAGTA GCGTATGCGC TCACGCAACT GGTCCAGAAC CTTGACCGAA CGCAGCGGTG 1020 

GTAACGGCGC AGTGGCGGTT TTCATGGCTT GTTATGACTG TTTTTTTGTA CAGTCTATGC 1080 

CTCGGGCATC CAAGCAGCAA GCGCGTTACG CCGTGGGTCG ATGTTTGATG TTATGGAGCA 1140 

GCAACGATGT TACGCAGCAG CAACGATGTT ACGCAGCAGG GCAGTCGCCC TAAAACAAAG 1200 

TTAGGTGGCT CAAGTATGGG CATCATTCGC ACATGTAGGC TCGGCCCTGA CCAAGTCAAA 1260 

TCCATGCGGG CTGCTCTTGA TCTTTTCGGT CGTGAGTTCG GAGACGTAGC CACCTACTCC 1320 

CAACATCAGC CGGACTCCGA TTACCTCGGG AACTTGCTCC GTAGTAAGAC ATTCATCGCG 1380 

CTTGCTGCCT TCGACCAAGA AGCGGTTGTT GGCGCTCTCG CGGCTTACGT TCTGCCCAGG 1440 

TTTGAGCAGC CGCGTAGTGA GATCTATATC TATGATCTCG CAGTCTCCGG CGAGCACCGG 1500 

AGGCAGGGCA TTGCCACCGC GCTCATCAAT CTCCTCAAGC ATGAGGCCAA CGCGCTTGGT 1560 

GCTTATGTGA TCTACGTGCA AGCAGATTAC GGTGACGATC CCGCAGTGGC TCTCTATACA 1620 

AAGTTGGGCA TACGGGAAGA AGTGATGCAC TTTGATATCG ACCCAAGTAC CGCCACCTAA 1680 

CAATTCGTTC AAGCCGAGAT CGGCTTCCCC TGTTTTGCAA TGGCGGTCGG CGAAAGTTGA 1740 

TGCGCTGTAT CGTGGTGAAG ATCAATCCAT GCTGCGTGAC GAGGCCACAC TGTGAGTTGG 1800 

TCAGGGGGGG CTTACTCGGC GTTTTCCGAC ACTGCGTTGG TTGCGGCAGT GCGCACCCCC 1860 

TGGATTGATT GCGGGGGTGC CCTGTCGCTG GTGTCGCCTA TCGACTTAGG GGTAAAGGTC 1920 

GCTCGCGAAG TTCTGATGCG TGCGTCGCTT GAACCACAAA TGGTCGATAG CGTACTCGCA 1980 

GGCTCTATGG CTCAAGCAAG CTTTGATGCT TACCTGCTCC CGCGGCACAT TGGCTTGTAC 2040 

AGCGGTGTTC CCAAGTCGGT TCCGGCCTTG GGGGTGCAGC GCATTTGCGG CACAGGCTTC 2100 

GAACTGCTTC GGCAGGCCGG CGAGCAGATT TCCCAAGGCG CTGATCACGT GCTGTGTGTC 2160 

GCGGGCTGCA G 2171 

Sequence 9 

CTGCAGCCGA GCATCGATTG AGCACTTTAC CCAGCTGCGC TGGCTGACCA TTCAGAATGG 60 

CCCGCGGCAC TATCCAATCT AAATCGATCT TCGGGCGCCG CGGGCATCAT GCCCGCGGCG 120 

CTCGCCTCAT TTCAATCTCT AACTTGATAA AAACAGAGCT GTTCTCCGGT CTTGGTGGAT 180 

CAAGGCCAGT CGCGGAGAGT CTCGAAGAGG AGAGTACAGT GAACGCCGAG TCCACATTGC 240 

AACCGCAGGC ATCATCATGC TCTGCTCAGC CACGCTACCG CAGTGTGTCG ATTGGTCATC 300 

CTCCGGTTGA GGTTACGCAA GACGCTGGAG GTATTGTCCG GATGCGTTCT CTCGAGGCGC 360 

TTCTTCCCTT CCCGGGTCGA ATTCTTGAGC GTCTCGAGCA TTGGGCTAAG ACCCGTCCAG 420 

AACAAACCTG CGTTGCTGCC AGGGCGGCAA ATGGGGAATG GCGTCGTATC AGCTACGCGG 480 

AAATGTTCCA CAACGTCCGC GCCATCGCAC AGAGCTTGCT TCCTTACGGA CTATCGGCAG 540 

AGCGTCCGCT GCTTATCGTC TCTGGAAATG ACCTGGAACA TCTTCAGCTG GCATTTGGGG 600 

CTATGTATGC GGGCATTCCC TATTGCCCGG TGTCTCCTGC TTATTCACTG CTGTCGCAAG 660 

ATTTGGCGAA GCTGCGTCAC ATCGTAGGTC TTCTGCAACC GGGACTGGTC TTTGCTGCCG 720 

ATGCAGCACC TTTCCAGCGC GCTGTTTTGC AATGGCGGTC GGCGAAAGTT GATGCGCTGT 780 

ATCGTGGTGA AGATCAATCC ATGCTGCGTG ACGAGGCCAC ACTGTGAGTT GGTCAGGGGG 840 

GGCTTACTCG GCGTTTTCCG ACACTGCGTT GGTTGCGGCA GTGCGCACCC CCTGGATTGA 900 

TTGCGGGGGT GCCCTGTCGC TGGTGTCGCC TATCGACTTA GGGGTAAAGG TCGCTCGCGA 960 

AGTTCTGATG CGTGCGTCGC TTGAACCACA AATGGTCGAT AGCGTACTCG CAGGCTCTAT 1020 

GGCTCAAGCA AGCTTTGATG CTTACCTGCT CCCGCGGCAC ATTGGCTTGT ACAGCGGTGT 1080 

TCCCAAGTCG GTTCCGGCCT TGGGGGTGCA GCGCATTTGC GGCACAGGCT TCGAACTGCT 1140 

TCGGCAGGCC GGCGAGCAGA TTTCCCAAGG CGCTGATCAC GTGCTGTGTG TCGCGGGCTG 1200 

CAG 1203 

Sequence 10 

GAATTCCCCT GGCGACGAAA GGGCGGCAGG CCGCATGGCC ACGGCTGGGC GGTAACTGAT 60 

GCTTGCGTTA ATCGTTAACC GTTTGAAATT CCTTGCCAAA TTTCGGCGAG AGAATCATGC 120 

GGGTACGCCT TTCCGTGCGC TTTGATCTGC GCTTCCGTGC CTTGAATCAG AAAAATAGTT 180 

AATTGACAGA ACTATAGGTT CGCAGTAGCT TTTGCTCACC CACCAAATCC ACAGCACTGG 240 

GGTGCACGAT GAATAGCTAC GATGGCCGTT GGTCTACCGT TGATGTGAAG GTTGAAGAAG 300 

GTATCGCTTG GGTCACGCTG AACCGCCCGG AGAAGCGCAA CGCAATGAGC CCAACTCTCA 360

ATCGAGAGAT GGTCGAGGTT CTGGAGGTGC TGGAGCAGGA CGCAGATGCT CGCGTGCTTG 420 

TTCTGACTGG TGCAGGCGAA TCCTGGACCG CGGGCATGGA CCTGAAGGAG TATTTCCGCG 480 

AGACCGATGC TGGCCCCGAA ATTCTGCAAG AGAAGATTCG TCGGGGACAG CAAGCGAACC 540

GGAATTGCCA GCTGGGGCGC CCTCTGGTAA GGTTGGGAAG CCCTGCAAAG TAAACTGGAT 600 

GGCTTTCTTG CCGCCAAGGA TCTGATGGCG CAGGGGATCA AGATCTGATC AAGAGACAGG 660 

ATGAGGATCG TTTCGCATGA TTGAACAAGA TGGATTGCAC GCAGGTTCTC CGGCCGCTTG 720

GGTGGAGAGG CTATTCGGCT ATGACTGGGC ACAACAGACA ATCGGCTGCT CTGATGCCGC 780

CGTGTTCCGG CTGTCAGCGC AGGGGCGCCC GGTTCTTTTT GTCAAGACCG ACCTGTCCGG 840 

TGCCCTGAAT GAACTGCAGG ACGAGGCAGC GCGGCTATCG TGGCTGGCCA CGACGGGCGT 900 

TCCTTGCGCA GCTGTGCTCG ACGTTGTCAC TGAAGCGGGA AGGGACTGGC TGCTATTGGG 960

CGAAGTGCCG GGGCAGGATC TCCTGTCATC TCACCTTGCT CCTGCCGAGA AAGTATCCAT 1020

CATGGCTGAT GCAATGCGGC GGCTGCATAC GCTTGATCCG GCTACCTGCC CATTCGACCA 1080 

CCAAGCGAAA CATCGCATCG AGCGAGCACG TACTCGGATG GAAGCCGGTC TTGTCGATCA 1140

GGATGATCTG GACGAAGAGC ATCAGGGGCT CGCGCCAGCC GAACTGTTCG CCAGGCTCAA 1200 

GGCGCGCATG CCCGACGGCG AGGATCTCGT CGTGACCCAT GGCGATGCCT GCTTGCCGAA 1260

TATCATGGTG GAAAATGGCC GCTTTTCTGG ATTCATCGAC TGTGGCCGGC TGGGTGTGGC 1320

GGACCGCTAT CAGGACATAG CGTTGGCTAC CCGTGATATT GCTGAAGAGC TTGGCGGCGA 1380

ATGGGCTGAC CGCTTCCTCG TGCTTTACGG TATCGCCGCT CCCGATTCGC AGCGCATCGC 1440 

CTTCTATCGC CTTCTTGACG AGTTCTTCTG AGCGGGACTC TGGGGTTCGA AATGACCGAC 1500

CAAGCGACGC CCCGAGCAGG GCATGAAGCA GTTCCTTGAC GAGAAAAGCA TCAAGCCGGG 1560 

CTTGCAGACC TACAAGCGCT GATAAATGCG CCGGGGCCCT CGCTGCGCCC CCGGCCTTCC 1620

AATAATGACA ATAATGAGGA GTGCCCAATG TTTCACGTGC CCCTGCTTAT TGGTGGTAAG 1680 

CCTTGTTCAG CATCTGATGA GCGCACCTTC GAGCGTCGTA GCCCGCTGAC CGGAGAAGTG 1740 

GTATCGCGCG TCGCTGCTGC CAGTTTGGAA GATGCGGACG CCGCAGTGGC CGCTGCACAG 1800

GCTGCGTTTC CTGAATGGGC GGCGCTTGCT CCGAGCGAAC GCCGTGCCCG ACTGCTGCGA 1860 

GCGGCGGATC TTCTAGAGGA CCGTTCTTCC GAGTTCACCG CCGCAGCGAG TGAAACTGGC 1920

GCAGCGGGAA ACTGGTATGG GTTTAACGTT TACCTGGCGG CGGGCATGTT GCGGGGAATT 1980 

C 1981

Sequence 11

GAATTCCCCT GGCGACGAAA GGGCGGCAGG CCGCATGGCC ACGGCTGGGC GGTAACTGAT 60 

GCTTGCGTTA ATCGTTAACC GTTTGAAATT CCTTGCCAAA TTTCGGCGAG AGAATCATGC 120 

GGGTACGCCT TTCCGTGCGC TTTGATCTGC GCTTCCGTGC CTTGAATCAG AAAAATAGTT 180 

AATTGACAGA ACTATAGGTT CGCAGTAGCT TTTGCTCACC CACCAAATCC ACAGCACTGG 240 

GGTGCACGAT GAATAGCTAC GATGGCCGTT GGTCTACCGT TGATGTGAAG GTTGAAGAAG 300

GTATCGCTTG GGTCACGCTG AACCGCCCGG AGAAGCGCAA CGCAATGAGC CCAACTCTCA 360

ATCGAGAGAT GGTCGAGGTT CTGGAGGTGC TGGAGCAGGA CGCAGATGCT CGCGTGCTTG 420

TTCTGACTGG TGCAGGCGAA TCCTGGACCG CGGGCATGGA CCTGAAGGAG TATTTCCGCG 480 

AGACCGATGC TGGCCCCGAA ATTCTGCAAG AGAAGATTCG TCGGGGGAGA GGCGGTTTGC 540 

GTATTGGGCG CATGCATAAA AACTGTTGTA ATTCATTAAG CATTCTGCCG ACATGGAAGC 600 

CATCACAAAC GGCATGATGA ACCTGAATCG CCAGCGGCAT CAGCACCTTG TCGCCTTGCG 660 

TATAATATTT GCCCATGGAC GCACACCGTG GAAACGGATG AAGGCACGAA CCCAGTTGAC 720 

ATAAGCCTGT TCGGTTCGTA AACTGTAATG CAAGTAGCGT ATGCGCTCAC GCAACTGGTC 780

CAGAACCTTG ACCGAACGCA GCGGTGGTAA CGGCGCAGTG GCGGTTTTCA TGGCTTGTTA 840 

TGACTGTTTT TTTGTACAGT CTATGCCTCG GGCATCCAAG CAGCAAGCGC GTTACGCCGT 900 

GGGTCGATGT TTGATGTTAT GGAGCAGCAA CGATGTTACG CAGCAGCAAC GATGTTACGC 960 

AGCAGGGCAG TCGCCCTAAA ACAAAGTTAG GTGGCTCAAG TATGGGCATC ATTCGCACAT 1020 

GTAGGCTCGG CCCTGACCAA GTCAAATCCA TGCGGGCTGC TCTTGATCTT TTCGGTCGTG 1080 

AGTTCGGAGA CGTAGCCACC TACTCCCAAC ATCAGCCGGA CTCCGATTAC CTCGGGAACT 1140

TGCTCCGTAG TAAGACATTC ATCGCGCTTG CTGCCTTCGA CCAAGAAGCG GTTGTTGGCG 1200 

CTCTCGCGGC TTACGTTCTG CCCAGGTTTG AGCAGCCGCG TAGTGAGATC TATATCTATG 1260

ATCTCGCAGT CTCCGGCGAG CACCGGAGGC AGGGCATTGC CACCGCGCTC ATCAATCTCC 1320 

TCAAGCATGA GGCCAACGCG CTTGGTGCTT ATGTGATCTA CGTGCAAGCA GATTACGGTG 1380

ACGATCCCGC AGTGGCTCTC TATACAAAGT TGGGCATACG GGAAGAAGTG ATGCACTTTG 1440

ATATCGACCC AAGTACCGCC ACCTAACAAT TCGTTCAAGC CGAGATCGGC TTCCCCGAGC 1500

AGGGCATGAA GCAGTTCCTT GACGAGAAAA GCATCAAGCC GGGCTTGCAG ACCTACAAGC 1560

GCTGATAAAT GCGCCGGGGC CCTCGCTGCG CCCCCGGCCT TCCAATAATG ACAATAATGA 1620

GGAGTGCCCA ATGTTTCACG TGCCCCTGCT TATTGGTGGT AAGCCTTGTT CAGCATCTGA 1680

TGAGCGCACC TTCGAGCGTC GTAGCCCGCT GACCGGAGAA GTGGTATCGC GCGTCGCTGC 1740

TGCCAGTTTG GAAGATGCGG ACGCCGCAGT GGCCGCTGCA CAGGCTGCGT TTCCTGAATG 1800

GGCGGCGCTT GCTCCGAGCG AACGCCGTGC CCGACTGCTG CGAGCGGCGG ATCTTCTAGA 1860

GGACCGTTCT TCCGAGTTCA CCGCCGCAGC GAGTGAAACT GGCGCAGCGG GAAACTGGTA 1920 

TGGGTTTAAC GTTTACCTGG CGGCGGGCAT GTTGCGGGGA ATTC 1964

Sequence 12

GAATTCCCCT GGCGACGAAA GGGCGGCAGG CCGCATGGCC ACGGCTGGGC GGTAACTGAT 60

GCTTGCGTTA ATCGTTAACC GTTTGAAATT CCTTGCCAAA TTTCGGCGAG AGAATCATGC 120

GGGTACGCCT TTCCGTGCGC TTTGATCTGC GCTTCCGTGC CTTGAATCAG AAAAATAGTT 180 

AATTGACAGA ACTATAGGTT CGCAGTAGCT TTTGCTCACC CACCAAATCC ACAGCACTGG 240 

GGTGCACGAT GAATAGCTAC GATGGCCGTT GGTCTACCGT TGATGTGAAG GTTGAAGAAG 300

GTATCGCTTG GGTCACGCTG AACCGCCCGG AGAAGCGCAA CGCAATGAGC CCAACTCTCA 360

ATCGAGAGAT GGTCGAGGTT CTGGAGGTGC TGGAGCAGGA CGCAGATGCT CGCGTGCTTG 420

TTCTGACTGG TGCAGGCGAA TCCTGGACCG CGGGCATGGA CCTGAAGGAG TATTTCCGCG 480 

AGACCGATGC TGGCCCCGAA ATTCTGCAAG AGAAGATTCG TCGCGAGCAG GGCATGAAGC 540 

AGTTCCTTGA CGAGAAAAGC ATCAAGCCGG GCTTGCAGAC CTACAAGCGC TGATAAATGC 600

GCCGGGGCCC TCGCTGCGCC CCCGGCCTTC CAATAATGAC AATAATGAGG AGTGCCCAAT 660

GTTTCACGTG CCCCTGCTTA TTGGTGGTAA GCCTTGTTCA GCATCTGATG AGCGCACCTT 720

CGAGCGTCGT AGCCCGCTGA CCGGAGAAGT GGTATCGCGC GTCGCTGCTG CCAGTTTGGA 780 

AGATGCGGAC GCCGCAGTGG CCGCTGCACA GGCTGCGTTT CCTGAATGGG CGGCGCTTGC 840 

TCCGAGCGAA CGCCGTGCCC GACTGCTGCG AGCGGCGGAT CTTCTAGAGG ACCGTTCTTC 900 

CGAGTTCACC GCCGCAGCGA GTGAAACTGG CGCAGCGGGA AACTGGTATG GGTTTAACGT 960 

TTACCTGGCG GCGGGCATGT TGCGGGGAAT TC 992

Sequence 13

GAATTCCAAT AATGACAATA ATGAGGAGTG CCCAATGTTT CACGTGCCCC TGCTTATTGG 60 

TGGTAAGCCT TGTTCAGCAT CTGATGAGCG CACCTTCGAG CGTCGTAGCC CGCTGACCGG 120

AGAAGTGGTA TCGCGCGTCG CTGCTGCCAG TTTGGAAGAT GCGGACGCCG CAGTGGCCGC 180 

TGCACAGGCT GCGTTTCCTG AATGGGCGGC GCTTGCTCCG AGCGAACGCC GTGCCCGACT 240 

GCTGCGAGCG GCGGATCTTC TAGAGGACCG TTCTTCCGAG TTCACCGCCG CAGCGAGTGA 300 

AACTGGCGCA GCGGGAAACT GGTATGGGTT TAACGTTTAC CTGGCGGCGG GCATGTTGCG 360 

GGAAGCCGCG GCCATGACCA CACAGATTCA GGGCGATGTC ATTCCGTCCA ATGTGCCCGG 420 

TAGCTTTGCC ATGGCGGTTC GACAGCCATG TGGCGTGGTG CTCGGTATTG CGCCTTGGAA 480 

TGCTCCGGTA ATCCTTGGCG TACGGGCTGT TGCGATGCCG TTGGCATGCG GCAATACCGT 540 

GGTGTTGAAA AGCTCTGAGC TGAGTCCCTT TACCCATCGC CTGATTGGTC AGGTGTTGCA 600 

TGATGCTGGT CTGGGGGATG GCGTGGTGAA TGTCATCAGC AATGCCCCGC AAGACGCTCC 660 

TGCGGTGGTG GAGCGACTGA TTGCAAATCC TGCGGTACGT CGAGTGAACT TCACCGGTTC 720

GACCCACGTT GGACGGATCA TTGGTGAGCT GTCTGCGCGT CATCTGAAGC CTGCTGTGCT 780

GGAATTAGGT GGTAAGGCTC CGTTCTTGGT CTTGGACGAT GCCGACCTCG ATGCGGCGGT 840

CGAAGCGGCG GCCTTTGGTG CCTACTTCAA TCAGGGTCAA ATCTGCATGT CCACTGAGCG 900 

TCTGATTGTG ACAGCAGTCG CAGACGCCTT TGTTGAAAAG CTGGCGAGGA AGGTCGCCAC 960 

ACTGCGTGCT GGCGATCCTA ATGATCCGCA ATCGGTCTTG GGTTCGTTGA TTGATGCCAA 1020 

TGCAGGTCAA CGCATCCAGG TTCTGGTCGA TGATGCGCTC GGGGACAGCA AGCGAACCGG 1080 

AATTGCCAGC TGGGGCGCCC TCTGGTAAGG TTGGGAAGCC CTGCAAAGTA AACTGGATGG 1140 

CTTTCTTGCC GCCAAGGATC TGATGGCGCA GGGGATCAAG ATCTGATCAA GAGACAGGAT 1200 

GAGGATCGTT TCGCATGATT GAACAAGATG GATTGCACGC AGGTTCTCCG GCCGCTTGGG 1260 

TGGAGAGGCT ATTCGGCTAT GACTGGGCAC AACAGACAAT CGGCTGCTCT GATGCCGCCG 1320

TGTTCCGGCT GTCAGCGCAG GGGCGCCCGG TTCTTTTTGT CAAGACCGAC CTGTCCGGTG 1380

CCCTGAATGA ACTGCAGGAC GAGGCAGCGC GGCTATCGTG GCTGGCCACG ACGGGCGTTC 1440 

CTTGCGCAGC TGTGCTCGAC GTTGTCACTG AAGCGGGAAG GGACTGGCTG CTATTGGGCG 1500

AAGTGCCGGG GCAGGATCTC CTGTCATCTC ACCTTGCTCC TGCCGAGAAA GTATCCATCA 1560 

TGGCTGATGC AATGCGGCGG CTGCATACGC TTGATCCGGC TACCTGCCCA TTCGACCACC 1620

AAGCGAAACA TCGCATCGAG CGAGCACGTA CTCGGATGGA AGCCGGTCTT GTCGATCAGG 1680

ATGATCTGGA CGAAGAGCAT CAGGGGCTCG CGCCAGGCGA ACTGTTCGCC AGGCTCAAGG 1740 

CGCGCATGCC CGACGGCGAG GATCTCGTCG TGACCCATGG CGATGCCTGC TTGCCGAATA 1800 

TCATGGTGGA AAATGGCCGC TTTTCTGGAT TCATCGACTG TGGCCGGCTG GGTGTGGCGG 1860 

ACCGCTATCA GGACATAGCG TTGGCTACCC GTGATATTGC TGAAGAGCTT GGCGGCGAAT 1920

GGGCTGACCG CTTCCTCGTG CTTTACGGTA TCGCCGCTCC CGATTCGCAG CGCATCGCCT 1980

TCTATCGCCT TCTTGACGAG TTCTTCTGAG CGGGACTCTG GGGTTCGAAA TGACCGACCA 2040

AGCGACGCCC GGCCCAGCGC GTCGATTCGG GCATTTGCCA TATCAATGGA CCGACTGTGC 2100

ATGACGAGGC TCAGATGCCA TTCGGTGGGG TGAAGTCCAG CGGCTACGGC AGCTTCGGCA 2160

GTCGAGCATC GATTGAGCAC TTTACCCAGC TGCGCTGGCT GACCATTCAG AATGGCCCGC 2220

GGCACTATCC AATCTAAATC GATCTTCGGG CGCCGCGGGC ATCATGCCCG CGGCGCTCGC 2280 

CTCATTTCAA TCTCTAACTT GATAAAAACA GAGCTGTTCT CCGGTCTTGG TGGATCAAGG 2340 

CCAGTCGCGG AGAGTCTCGA AGAGGAGAGT ACAGTGAACG CCGAGTCCAC ATTGCAACCG 2400

CAGGCATCAT CATGCTCTGC TCAGCCACGC TACCGCAGTG TGTCGATTGG TCATCCTCCG 2460 

GTTGAGGTTA CGCAAGACGC TGGAGGTATT GTCCGGATGC GTTCTCTCGA GGCGCTTCTT 2520 

CCCTTCCCGG GTGGAATTC 2539

Sequence 14

GAATTCCAAT AATGACAATA ATGAGGAGTG CCCAATGTTT CACGTGCCCC TGCTTATTGG 60 

TGGTAAGCCT TGTTCAGCAT CTGATGAGCG CACCTTCGAG CGTCGTAGCC CGCTGACCGG 120 

AGAAGTGGTA TCGCGCGTCG CTGCTGCCAG TTTGGAAGAT GCGGACGCCG CAGTGGCCGC 180 

TGCACAGGCT GCGTTTCCTG AATGGGCGGC GCTTGCTCCG AGCGAACGCC GTGCCCGACT 240 

GCTGCGAGCG GCGGATCTTC TAGAGGACCG TTCTTCCGAG TTCACCGCCG CAGCGAGTGA 300

AACTGGCGCA GCGGGAAACT GGTATGGGTT TAACGTTTAC CTGGCGGCGG GCATGTTGCG 360

GGAAGCCGCG GCCATGACCA CACAGATTCA GGGCGATGTC ATTCCGTCCA ATGTGCCCGG 420 

TAGCTTTGCC ATGGCGGTTC GACAGCCATG TGGCGTGGTG CTCGGTATTG CGCCTTGGAA 480

TGCTCCGGTA ATCCTTGGCG TACGGGCTGT TGCGATGCCG TTGGCATGCG GCAATACCGT 540 

GGTGTTGAAA AGCTCTGAGC TGAGTCCCTT TACCCATCGC CTGATTGGTC AGGTGTTGCA 600 

TGATGCTGGT CTGGGGGATG GCGTGGTGAA TGTCATCAGC AATGCCCCGC AAGACGCTCC 660 

TGCGGTGGTG GAGCGACTGA TTGCAAATCC TGCGGTACGT CGAGTGAACT TCACCGGTTC 720

GACCCACGTT GGACGGATCA TTGGTGAGCT GTCTGCGCGT CATCTGAAGC CTGCTGTGCT 780

GGAATTAGGT GGTAAGGCTC CGTTCTTGGT CTTGGACGAT GCCGACCTCG ATGCGGCGGT 840 

CGAAGCGGCG GCCTTTGGTG CCTACTTCAA TCAGGGTCAA ATCTGCATGT CCACTGAGCG 900 

TCTGATTGTG ACAGCAGTCG CAGACGCCTT TGTTGAAAAG CTGGCGAGGA AGGTCGCCAC 960

ACTGCGTGCT GGCGATCCTA ATGATCCGCA ATCGGTCTTG GGTTCGTTGA TTGATGCCAA 1020

TGCAGGTCAA CGCATCCAGG TGGGGAGAGG CGGTTTGCGT ATTGGGCGCA TGCATAAAAA 1080 

CTGTTGTAAT TCATTAAGCA TTCTGCCGAC ATGGAAGCCA TCACAAACGG CATGATGAAC 1140 

CTGAATCGCC AGCGGCATCA GCACCTTGTC GCCTTGCGTA TAATATTTGC CCATGGACGC 1200 

ACACCGTGGA AACGGATGAA GGCACGAACC CAGTTGACAT AAGCCTGTTC GGTTCGTAAA 1260 

CTGTAATGCA AGTAGCGTAT GCGCTCACGC AACTGGTCCA GAACCTTGAC CGAACGCAGC 1320

GGTGGTAACG GCGCAGTGGC GGTTTTCATG GCTTGTTATG ACTGTTTTTT TGTACAGTCT 1380 

ATGCCTCGGG CATCCAAGCA GCAAGCGCGT TACGCCGTGG GTCGATGTTT GATGTTATGG 1440

AGCAGCAACG ATGTTACGCA GCAGCAACGA TGTTACGCAG CAGGGCAGTC GCCCTAAAAC 1500 

AAAGTTAGGT GGCTCAAGTA TGGGCATCAT TCGCACATGT AGGCTCGGCC CTGACCAAGT 1560 

CAAATCCATG CGGGCTGCTC TTGATCTTTT CGGTCGTGAG TTCGGAGACG TAGCCACCTA 1620 

CTCCCAACAT CAGCCGGACT CCGATTACCT CGGGAACTTG CTCCGTAGTA AGACATTCAT 1680 

CGCGCTTGCT GCCTTCGACC AAGAAGCGGT TGTTGGCGCT CTCGCGGCTT ACGTTCTGCC 1740 

CAGGTTTGAG CAGCCGCGTA GTGAGATCTA TATCTATGAT CTCGCAGTCT CCGGCGAGCA 1800

CCGGAGGCAG GGCATTGCCA CCGCGCTCAT CAATCTCCTC AAGCATGAGG CCAACGCGCT 1860

TGGTGCTTAT GTGATCTACG TGCAAGCAGA TTACGGTGAC GATCCCGCAG TGGCTCTCTA 1920 

TACAAAGTTG GGCATACGGG AAGAAGTGAT GCACTTTGAT ATCGACCCAA GTACCGCCAC 1980

CTAACAATTC GTTCAAGCCG AGATCGGCTT CCCAATTGGC CCAGCGCGTC GATTCGGGCA 2040

TTTGCCATAT CAATGGACCG ACTGTGCATG ACGAGGCTCA GATGCCATTC GGTGGGGTGA 2100

AGTCCAGCGG CTACGGCAGC TTCGGCAGTC GAGCATCGAT TGAGCACTTT ACCCAGCTGC 2160
GCTGGCTGAC CATTCAGAAT GGCCCGCGGC ACTATCCAAT CTAAATCGAT CTTCGGGCGC 2220

CGCGGGCATC ATGCCCGCGG CGCTCGCCTC ATTTCAATCT CTAACTTGAT AAAAACAGAG 2280

CTGTTCTCCG GTCTTGGTGG ATCAAGGCCA GTCGCGGAGA GTCTCGAAGA GGAGAGTACA 2340 

GTGAACGCCG AGTCCACATT GCAACCGCAG GCATCATCAT GCTCTGCTCA GCCACGCTAC 2400 

CGCAGTGTGT CGATTGGTCA TCCTCCGGTT GAGGTTACGC AAGACGCTGG AGGTATTGTC 2460

CGGATGCGTT CTCTCGAGGC GCTTCTTCCC TTCCCGGGTG GAATTC 2506

Sequence 15

GAATTCCAAT AATGACAATA ATGAGGAGTG CCCAATGTTT CACGTGCCCC TGCTTATTGG 60 
TGGTAAGCCT TGTTCAGCAT CTGATGAGCG CACCTTCGAG CGTCGTAGCC CGCTGACCGG 120
AGAAGTGGTA TCGCGCGTCG CTGCTGCCAG TTTGGAAGAT GCGGACGCCG CAGTGGCCGC 180 
TGCACAGGCT GCGTTTCCTG AATGGGCGGC GCTTGCTCCG AGCGAACGCC GTGCCCGACT 240
GCTGCGAGCG GCGGATCTTC TAGAGGACCG TTCTTCCGAG TTCACCGCCG CAGCGAGTGA 300 
AACTGGCGCA GCGGGAAACT GGTATGGGTT TAACGTTTAC CTGGCGGCGG GCATGTTGCG 360
GGAAGCCGCG GCCATGACCA CACAGATTCA GGGCGATGTC ATTCCGTCCA ATGTGCCCGG 420 
TAGCTTTGCC ATGGCGGTTC GACAGCCATG TGGCGTGGTG CTCGGTATTG CGCCTTGGAA 480 
TGCTCCGGTA ATCCTTGGCG TACGGGCTGT TGCGATGCCG TTGGCATGCG GCAATACCGT 540 
GGTGTTGAAA AGCTCTGAGC TGAGTCCCTT TACCCATCGC CTGATTGGTC AGGTGTTGCA 600 
TGATGCTGGT CTGGGGGATG GCGTGGTGAA TGTCATCAGC AATGCCCCGC AAGACGCTCC 660
TGCGGTGGTG GAGCGACTGA TTGCAAATCC TGCGGTACGT CGAGTGAACT TCACCGGTTC 720 
GACCCACGTT GGACGGATCA TTGGTGAGCT GTCTGCGCGT CATCTGAAGC CTGCTGTGCT 780 
GGAATTAGGT GGTAAGGCTC CGTTCTTGGT CTTGGACGAT GCCGACCTCG ATGCGGCGGT 840
CGAAGCGGCG GCCTTTGGTG CCTACTTCAA TCAGGGTCAA ATCTGCATGT CCACTGAGCG 900 
TCTGATTGTG ACAGCAGTCG CAGACGCCTT TGTTGAAAAG CTGGCGAGGA AGGTCGCCAC 960 

ACTGCGTGCT GGCGATCCTA ATGATCCGCA ATCGGTCTTG GGTTCGTTGA TTGATGCCAA 1020

TGCAGGTCAA CGCATCCAGG TTCTGGTCGA TGATGCGCTC GCAAAAGGCG CGCAATGGAA 1080 

TTGGCCCAGC GCGTCGATTC GGGCATTTGC CATATCAATG GACCGACTGT GCATGACGAG 1140

GCTCAGATGC CATTCGGTGG GGTGAAGTCC AGCGGCTACG GCAGCTTCGG CAGTCGAGCA 1200 

TCGATTGAGC ACTTTACCCA GCTGCGCTGG CTGACCATTC AGAATGGCCC GCGGCACTAT 1260

CCAATCTAAA TCGATCTTCG GGCGCCGCGG GCATCATGCC CGCGGCGCTC GCCTCATTTC 1320

AATCTCTAAC TTGATAAAAA CAGAGCTGTT CTCCGGTCTT GGTGGATCAA GGCCAGTCGC 1380 

GGAGAGTCTC GAAGAGGAGA GTACAGTGAA CGCCGAGTCC ACATTGCAAC CGCAGGCATC 1440 

ATCATGCTCT GCTCAGCCAC GCTACCGCAG TGTGTCGATT GGTCATCCTC CGGTTGAGGT 1500 

TACGCAAGAC GCTGGAGGTA TTGTCCGGAT GCGTTCTCTC GAGGCGCTTC TTCCCTTCCC 1560

GGGTGGAATT C 1571

Sequence 16

GAATTCCGCG GTCGGCGAAA GTTGATGCGC TGTATCGTGG TGAAGATCAA TCCATGCTGC 60 

GTGACGAGGC CACACTGTGA GTTGGTCAGG GGGGGCTTAC TCGGCGTTTT CCGACACTGC 120 

GTTGGTTGCG GCAGTGCGCA CCCCCTGGAT TGATTGCGGG GGTGCCCTGT CGCTGGTGTC 180 

GCCTATCGAC TTAGGGGTAA AGGTCGCTCG CGAAGTTCTG ATGCGTGCGT CGCTTGAACC 240 

ACAAATGGTC GATAGCGTAC TCGCAGGCTC TATGGCTCAA GCAAGCTTTG ATGCTTACCT 300 

GCTCCCGCGG CACATTGGCT TGTACAGCGG TGTTCCCAAG TCGGTTCCGG CCTTGGGGGT 360

GCAGCGCATT TGCGGCACAG GCTTCGAACT GCTTCGGCAG GCCGGCGAGC AGATTTCCCA 420 

AGGCGCTGAT CACGTGCTGT GTGTCGCGGC AGAGTCCATG TCGCGTAACC CCATCGCGTC 480 

GTATACACAC CGGGGCGGGT TCCGCCTCGG TGCGCCCGTT GAGTTCAAGG ATTTTTTGTG 540 

GGAGGCATTG TTTGATCCTG CTCCAGGACT CGACATGATC GCTACCGCAG AAAACCTGGG 600 

GACAGCAAGC GAACCGGAAT TGCCAGCTGG GGCGCCCTCT GGTAAGGTTG GGAAGCCCTG 660 

CAAAGTAAAC TGGATGGCTT TCTTGCCGCC AAGGATCTGA TGGCGCAGGG GATCAAGATC 720 

TGATCAAGAG ACAGGATGAG GATCGTTTCG CATGATTGAA CAAGATGGAT TGCACGCAGG 780 

TTCTCCGGCC GCTTGGGTGG AGAGGCTATT CGGCTATGAC TGGGCACAAC AGACAATCGG 840 

CTGCTCTGAT GCCGCCGTGT TCCGGCTGTC AGCGCAGGGG CGCCCGGTTC TTTTTGTCAA 900 

GACCGACCTG TCCGGTGCCC TGAATGAACT GCAGGACGAG GCAGCGCGGC TATCGTGGCT 960 

GGCCACGACG GGCGTTCCTT GCGCAGCTGT GCTCGACGTT GTCACTGAAG CGGGAAGGGA 1020

CTGGCTGCTA TTGGGCGAAG TGCCGGGGCA GGATCTCCTG TCATCTCACC TTGCTCCTGC 1080 

CGAGAAAGTA TCCATCATGG CTGATGCAAT GCGGCGGCTG CATACGCTTG ATCCGGCTAC 1140

CTGCCCATTC GACCACCAAG CGAAACATCG CATCGAGCGA GCACGTACTC GGATGGAAGC 1200

CGGTCTTGTC GATCAGGATG ATCTGGACGA AGAGCATCAG GGGCTCGCGC CAGCCGAACT 1260

GTTCGCCAGG CTCAAGGCGC GCATGCCCGA CGGCGAGGAT CTCGTCGTGA CCCATGGCGA 1320 

TGCCTGCTTG CCGAATATCA TGGTGGAAAA TGGCCGCTTT TCTGGATTCA TCGACTGTGG 1380 

CCGGCTGGGT GTGGCGGACC GCTATCAGGA CATAGCGTTG GCTACCCGTG ATATTGCTGA 1440 

AGAGCTTGGC GGCGAATGGG CTGACCGCTT CCTCGTGCTT TACGGTATCG CCGCTCCCGA 1500

TTCGCAGCGC ATCGCCTTCT ATCGCCTTCT TGACGAGTTC TTCTGAGCGG GACTCTGGGG 1560 

TTCGAAATGA CCGACCAAGC GACGCCCATT GAGGGCGCAA GAGGAGAAAT GGATTGACCA 1620

AGAGATCGTG GCTGTTACGG ATGAACAGTT CGATTTAGAG GGCTACAACA GTCGAGCAAT 1680 

TGAACTGCCT CGGAAGGCAA AATTGTTGAT CGTGACAGTC ATCCGCGGCC TAGCAGTCTT 1740 

TGAAGCCCTT TCCCGATTGA AGCCTGTTCA TTCTGGCGGG GTGCAGACTG CGGGCAACAG 1800 

CTGTGCCGTA GTGGACGGCG CCGCGGCGGC TTTGGTGGCT CGAGAGTCGT CTGCGACACA 1860 

GCCGGTCTTG GCTAGGATAC TGGCTACCTC CGTAGTCGGG ATCGAGCCCG AGCATATGGG 1920

GCTCGGCCCT GCGCCCGCGA TTCGCCTGCT GCTTGCGCGT AGTGATCTTA GTTTGAGGGA 1980 

TATCGACCTC TTTGAGATAA ACGAGGCGCA GGCCGCCCAA GTTCTAGCGG TACAGCATGA 2040

ATTGGGTATT GAGCACTCAA AACTTAATAT TTGGGGCGGG GCCATTGCAC TTGGACACCC 2100 

GCTTGCCGCG ACCGGATTGC GTCTCTGCAT GACCCTCGCT CACCAATTGC AAGCTAATAA 2160
CTTTCGATAT GGAATTGCCT CGGCATGCAT TGGTGGGGGA CAGGGGATGG CGGTTCTTTT 2220

AGAGAATCCC CACTTCGGTT CGTCCTCTGC ACGAAGTTCG ATGATTAACA GAGTTGACCA 2280 

CTATCCACTG AGCTAACGGG CATCTCCTTT GTTGCTTTGA GGTGGCGCAC GAAGGAGGGC 2340 

TCGAAAATCT CTGCTAAAAA CAAGAAGAAG GAACAGGGAA CATGATTAGT TTCGCTCGTA 2400 

TGGCAGAAAG TTTAGGAGTC CAGGCTAAAC TTGCCCTTGC CTTCGCACTC GTATTATGTG 2460 

TCGGGCTGAT TGTTACCGGC ACGGGTTTCT ACAGTGTACA TACCTTGTCA GGGTTGGTGG 2520

GAATTC 2526

Sequence 17

GAATTCCGCG GTCGGCGAAA GTTGATGCGC TGTATCGTGG TGAAGATCAA TCCATGCTGC 60 

GTGACGAGGC CACACTGTGA GTTGGTCAGG GGGGGCTTAC TCGGCGTTTT CCGACACTGC 120

GTTGGTTGCG GCAGTGCGCA CCCCCTGGAT TGATTGCGGG GGTGCCCTGT CGCTGGTGTC 180

GCCTATCGAC TTAGGGGTAA AGGTCGCTCG CGAAGTTCTG ATGCGTGCGT CGCTTGAACC 240 

ACAAATGGTC GATAGCGTAC TCGCAGGCTC TATGGCTCAA GCAAGCTTTG ATGCTTACCT 300 

GCTCCCGCGG CACATTGGCT TGTACAGCGG TGTTCCCAAG TCGGTTCCGG CCTTGGGGGT 360

GCAGCGCATT TGCGGCACAG GCTTCGAACT GCTTCGGCAG GCCGGCGAGC AGATTTCCCA 420 

AGGCGCTGAT CACGTGCTGT GTGTCGCGGC AGAGTCCATG TCGCGTAACC CCATCGCGTC 480 

GTATACACAC CGGGGCGGGT TCCGCCTCGG TGCGCCCGTT GAGTTCAAGG ATTTTTTGTG 540 

GGAGGCATTG TTTGATCCTG CTCCAGGACT CGACATGATC GCTACCGCAG AAAACCTGGG 600

GGAGAGGCGG TTTGCGTATT GGGCGCATGC ATAAAAACTG TTGTAATTCA TTAAGCATTC 660 

TGCCGACATG GAAGCCATCA CAAACGGCAT GATGAACCTG AATCGCCAGC GGCATCAGCA 720 

CCTTGTCGCC TTGCGTATAA TATTTGCCCA TGGACGCACA CCGTGGAAAC GGATGAAGGC 780

ACGAACCCAG TTGACATAAG CCTGTTCGGT TCGTAAACTG TAATGCAAGT AGCGTATGCG 840 

CTCACGCAAC TGGTCCAGAA CCTTGACCGA ACGCAGCGGT GGTAACGGCG CAGTGGCGGT 900

TTTCATGGCT TGTTATGACT GTTTTTTTGT ACAGTCTATG CCTCGGGCAT CCAAGCAGCA 960

AGCGCGTTAC GCCGTGGGTC GATGTTTGAT GTTATGGAGC AGCAACGATG TTACGCAGCA 1020

GCAACGATGT TACGCAGCAG GGCAGTCGCC CTAAAACAAA GTTAGGTGGC TCAAGTATGG 1080

GCATCATTCG CACATGTAGG CTCGGCCCTG ACCAAGTCAA ATCCATGCGG GCTGCTCTTG 1140 

ATCTTTTCGG TCGTGAGTTC GGAGACGTAG CCACCTACTC CCAACATCAG CCGGACTCCG 1200 

ATTACCTCGG GAACTTGCTC CGTAGTAAGA CATTCATCGC GCTTGCTGCC TTCGACCAAG 1260 

AAGCGGTTGT TGGCGCTCTC GCGGCTTACG TTCTGCCCAG GTTTGAGCAG CCGCGTAGTG 1320 

AGATCTATAT CTATGATCTC GCAGTCTCCG GCGAGCACCG GAGGCAGGGC ATTGCCACCG 1380 

CGCTCATCAA TCTCCTCAAG CATGAGGCCA ACGCGCTTGG TGCTTATGTG ATCTACGTGC 1440 

AAGCAGATTA CGGTGACGAT CCCGCAGTGG CTCTCTATAC AAAGTTGGGC ATACGGGAAG 1500 

AAGTGATGCA CTTTGATATC GACCCAAGTA CCGCCACCTA ACAATTCGTT CAAGCCGAGA 1560

TCGGCTTCCC ATTGAGGGCG CAAGAGGAGA AATGGATTGA CCAAGAGATC GTGGCTGTTA 1620 

CGGATGAACA GTTCGATTTA GAGGGCTACA ACAGTCGAGC AATTGAACTG CCTCGGAAGG 1680

CAAAATTGTT GATCGTGACA GTCATCCGCG GCCTAGCAGT CTTTGAAGCC CTTTCCCGAT 1740

TGAAGCCTGT TCATTCTGGC GGGGTGCAGA CTGCGGGCAA CAGCTGTGCC GTAGTGGACG 1800 

GCGCCGCGGC GGCTTTGGTG GCTCGAGAGT CGTCTGCGAC ACAGCCGGTC TTGGCTAGGA 1860

TACTGGCTAC CTCCGTAGTC GGGATCGAGC CCGAGCATAT GGGGCTCGGC CCTGCGCCCG 1920 

CGATTCGCCT GCTGCTTGCG CGTAGTGATC TTAGTTTGAG GGATATCGAC CTCTTTGAGA 1980 

TAAACGAGGC GCAGGCCGCC CAAGTTCTAG CGGTACAGCA TGAATTGGGT ATTGAGCACT 2040 

CAAAACTTAA TATTTGGGGC GGGGCCATTG CACTTGGACA CCCGCTTGCC GCGACCGGAT 2100 

TGCGTCTCTG CATGACCCTC GCTCACCAAT TGCAAGCTAA TAACTTTCGA TATGGAATTG 2160 






CCTCGGCATG CATTGGTGGG GGACAGGGGA TGGCGGTTCT TTTAGAGAAT CCCCACTTCG 2220 

GTTCGTCCTC TGCACGAAGT TCGATGATTA ACAGAGTTGA CCACTATCCA CTGAGCTAAC 2280 

GGGCATCTCC TTTGTTGCTT TGAGGTGGCG CACGAAGGAG GGCTCGAAAA TCTCTGCTAA 2340 

AAACAAGAAG AAGGAACAGG GAACATGATT AGTTTCGCTC GTATGGCAGA AAGTTTAGGA 2400 

GTCCAGGCTA AACTTGCCCT TGCCTTCGCA CTCGTATTAT GTGTCGGGCT GATTGTTACC 2460 

GGCACGGGTT TCTACAGTGT ACATACCTTG TCAGGGTTGG TGGGAATTC 2509

Sequence 18

GAATTCCGCG GTCGGCGAAA GTTGATGCGC TGTATCGTGG TGAAGATCAA TCCATGCTGC 60 

GTGACGAGGC CACACTGTGA GTTGGTCAGG GGGGGCTTAC TCGGCGTTTT CCGACACTGC 120

GTTGGTTGCG GCAGTGCGCA CCCCCTGGAT TGATTGCGGG GGTGCCCTGT CGCTGGTGTC 180 

GCCTATCGAC TTAGGGGTAA AGGTCGCTCG CGAAGTTCTG ATGCGTGCGT CGCTTGAACC 240 

ACAAATGGTC GATAGCGTAC TCGCAGGCTC TATGGCTCAA GCAAGCTTTG ATGCTTACCT 300

GCTCCCGCGG CACATTGGCT TGTACAGCGG TGTTCCCAAG TCGGTTCCGG CCTTGGGGGT 360 

GCAGCGCATT TGCGGCACAG GCTTCGAACT GCTTCGGCAG GCCGGCGAGC AGATTTCCCA 420

AGGCGCTGAT CACGTGCTGT GTGTCGCGGC AGAGTCCATG TCGCGTAACC CCATCGCGTC 480

GTATACACAC CGGGGCGGGT TCCGCCTCGG TGCGCCCGTT GAGTTCAAGG ATTTTTTGTG 540

GGAGGCATTG TTTGATCCTG CTCCAGGACT CGACATGATC GCTACCGCAG AAAACCTGGC 600 

GCGCATTGAG GGCGCAAGAG GAGAAATGGA TTGACCAAGA GATCGTGGCT GTTACGGATG 660

AACAGTTCGA TTTAGAGGGC TACAACAGTC GAGCAATTGA ACTGCCTCGG AAGGCAAAAT 720

TGTTGATCGT GACAGTCATC CGCGGCCTAG CAGTCTTTGA AGCCCTTTCC CGATTGAAGC 780 

CTGTTCATTC TGGCGGGGTG CAGACTGCGG GCAACAGCTG TGCCGTAGTG GACGGCGCCG 840 

CGGCGGCTTT GGTGGCTCGA GAGTCGTCTG CGACACAGCC GGTCTTGGCT AGGATACTGG 900

CTACCTCCGT AGTCGGGATC GAGCCCGAGC ATATGGGGCT CGGCCCTGCG CCCGCGATTC 960

GCCTGCTGCT TGCGCGTAGT GATCTTAGTT TGAGGGATAT CGACCTCTTT GAGATAAACG 1020

AGGCGCAGGC CGCCCAAGTT CTAGCGGTAC AGCATGAATT GGGTATTGAG CACTCAAAAC 1080 

TTAATATTTG GGGCGGGGCC ATTGCACTTG GACACCCGCT TGCCGCGACC GGATTGCGTC 1140 

TCTGCATGAC CCTCGCTCAC CAATTGCAAG CTAATAACTT TCGATATGGA ATTGCCTCGG 1200 

CATGCATTGG TGGGGGACAG GGGATGGCGG TTCTTTTAGA GAATCCCCAC TTCGGTTCGT 1260

CCTCTGCACG AAGTTCGATG ATTAACAGAG TTGACCACTA TCCACTGAGC TAACGGGCAT 1320

CTCCTTTGTT GCTTTGAGGT GGCGCACGAA GGAGGGCTCG AAAATCTCTG CTAAAAACAA 1380

GAAGAAGGAA CAGGGAACAT GATTAGTTTC GCTCGTATGG CAGAAAGTTT AGGAGTCCAG 1440 

GCTAAACTTG CCCTTGCCTT CGCACTCGTA TTATGTGTCG GGCTGATTGT TACCGGCACG 1500 

GGTTTCTACA GTGTACATAC CTTGTCAGGG TTGGTGGGAA TTC 1543

Fig. 1a: calAΩKm
Fig. 1b: calAΩGm
Fig. 1c: calAΔ (deletion of 539 bp)
Fig. 1d: calBΩKm
Fig. 1e: calBΩGm
Fig. 1f: calBΔ (deletion of 586 bp)
[Restriction maps; sites marked: PstI, EcoRI, SmaI*, BglII/Bal31; inserted ΩKm element]

Fig. 1g: fcsΩKm
Fig. 1h: fcsΩGm
Fig. 1i: fcsΔ (deletion of 1290 bp)
Fig. 1j: echΩKm
Fig. 1k: echΩGm
Fig. 1l: echΔ (deletion of 463 bp)
[Restriction maps; sites marked: PstI, EcoRI, BssHII, NruI*, SmaI*; inserted ΩKm element]

Fig. 1m: vdhΩKm
Fig. 1n: vdhΩGm
Deletion of 210 bp
Fig. 1p: aatΩKm
Fig. 1q: aatΩGm
Fig. 1r: aatΔ (deletion of 59 bp)
[Restriction maps; sites marked: EcoRI, BssHII, SmaI*; inserted ΩKm and ΩGm elements]